DRUG TARGETS IN CANDIDA ALBICANS



The present invention is concerned with the 
identification of genes or functional fragments 
thereof from Candida albicans which are critical for 
growth and cell division and which genes may be used 
as selective drug targets to treat Candida albicans 
associated infections. Novel nucleic acid sequences
from Candida albicans are also provided and which 
encode the polypeptides which are critical for growth 
of Candida albicans. 

Opportunistic infections in immunocompromised 
hosts represent an increasingly common cause of 
mortality and morbidity. Candida species are among
the most commonly identified fungal pathogens 
associated with such opportunistic infections, with 
Candida albicans being the most common species. Such 
fungal infections are thus problematical in, for 
example, AIDS populations in addition to normal 
healthy women where Candida albicans yeasts represent 
the most common cause of vulvovaginitis.

Although compounds do exist for treating such 
disorders, such as for example, amphotericin, these 
drugs are generally limited in their treatment because 
of their toxicity and side effects. Therefore, there 
exists a need for new compounds which may be used to 
treat Candida associated infections in addition to 
compounds which are selective in their action against 
Candida albicans. 

Classical approaches for identifying anti-fungal 
compounds have relied almost exclusively on inhibition 
of fungal or yeast growth as an endpoint. Libraries 
of natural products, semi-synthetic, or synthetic 
chemicals are screened for their ability to kill or 
arrest growth of the target pathogen or a related
nonpathogenic model organism. These tests are
cumbersome and provide no information about a
compound's mechanism of action. The promising lead
compounds that emerge from such screens must then be
tested for possible host-toxicity and detailed
mechanism of action studies must subsequently be
conducted to identify the affected molecular target.

The present inventors have now identified a range
of nucleic acid sequences from Candida albicans which
encode polypeptides which are critical for its
survival and growth. These sequences represent novel
targets which can be incorporated into an assay to
selectively identify compounds capable of inhibiting
expression of such polypeptides and their potential
use in alleviating diseases or conditions associated
with Candida albicans infection.

Therefore, according to a first aspect of the
invention there is provided a nucleic acid molecule
encoding a polypeptide which is critical for survival
and growth of the yeast Candida albicans and which
nucleic acid molecule comprises any of the sequences
of nucleotides illustrated in Figures 1, 2, 4 to 7, 9
to 11, 13, 15 to 20, 22 to 26, 28 to 32, 34 to 43, 45a
and b, 47 to 49, 51, 52, 53 to 57, 59 and 60.

A further aspect of the invention comprises a
nucleic acid molecule encoding a polypeptide which is
critical for survival and growth of the yeast Candida
albicans and which nucleic acid molecule comprises any
of the sequences of nucleotides illustrated in Figures
1, 2, 36, 37a and b, 38, 39 and 40 and fragments or
derivatives of said nucleic acid molecules.

Letters utilised in the sequences according to
the invention which are not recognisable as letters of
the genetic code signify a position in the nucleic
acid sequence where one or more of bases A, G, C or T
can occupy the nucleotide position. Representative
letters used to identify the range of bases which can
be used are as follows:



M:    A or C
R:    A or G
W:    A or T
S:    C or G
Y:    C or T
K:    G or T
V:    A or C or G
H:    A or C or T
D:    A or G or T
B:    C or G or T
N:    G or A or T or C
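The ambiguity letters listed above can be illustrated with a short sketch. It is not part of the described method; the function name `expand_ambiguous` is a hypothetical helper that lists every unambiguous sequence a coded sequence could stand for.

```python
from itertools import product

# Ambiguity letters as listed in the table above; A, C, G and T map to themselves.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def expand_ambiguous(seq):
    """Return every unambiguous sequence that the coded sequence can denote."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # "R" is A or G and "Y" is C or T, so "ARY" denotes four sequences.
    print(expand_ambiguous("ARY"))
```

The number of expansions grows multiplicatively with each ambiguous position, so this kind of enumeration is only practical for short stretches such as probes or primers.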



In one embodiment of each of the above identified
aspects of the invention the nucleic acid may comprise
a mRNA molecule or alternatively a DNA and preferably
a cDNA molecule.

Also provided by the present invention is a
nucleic acid molecule capable of hybridising to the
nucleic acid molecules illustrated in any of Figures 1
to 61 under high stringency conditions.

Stringency of hybridisation as used herein refers
to conditions under which polynucleic acids are
stable. The stability of hybrids is reflected in the
melting temperature (Tm) of the hybrids. Tm can be
approximated by the formula:

Tm = 81.5°C + 16.6 log10[Na+] + 0.41 (%G&C) - 600/l

wherein l is the length of the hybrids in nucleotides.
Tm decreases approximately by 1-1.5°C with every 1%
decrease in sequence homology.
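As a worked illustration of the Tm approximation above, the following minimal sketch evaluates the formula for given salt concentration, G+C content and hybrid length. The function names are hypothetical, and the default drop of 1.25°C per 1% mismatch is an assumed midpoint of the 1-1.5°C range stated in the text.

```python
import math


def melting_temperature(na_molar, gc_percent, length_nt):
    """Approximate Tm in degrees C: 81.5 + 16.6*log10[Na+] + 0.41*(%G&C) - 600/l."""
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 600.0 / length_nt


def tm_with_mismatch(tm_perfect, percent_mismatch, drop_per_percent=1.25):
    """Lower Tm by an assumed 1.25 degrees C (midpoint of 1-1.5) per 1% mismatch."""
    return tm_perfect - drop_per_percent * percent_mismatch


if __name__ == "__main__":
    # Illustrative inputs only: 1 M Na+, 50% G+C, 600-nucleotide hybrid.
    tm = melting_temperature(1.0, 50.0, 600)
    print(round(tm, 1))
    print(round(tm_with_mismatch(tm, 10.0), 1))
```

With these illustrative inputs the salt term vanishes (log10 of 1 is 0), which makes the contribution of G+C content and length easy to check by hand.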

The nucleic acid capable of hybridising to
nucleic acid molecules according to the invention will
generally be at least 70%, preferably at least 80 or
90% and more preferably at least 95% homologous to the
nucleotide sequences illustrated in any of Figures 1
to 61.

The DNA molecules according to the invention may,
advantageously, be included in a suitable expression
vector to express polypeptides encoded therefrom in a
suitable host.

An expression vector according to the invention
includes a vector having a nucleic acid according to
the invention operably linked to regulatory sequences,
such as promoter regions, that are capable of
effecting expression of said DNA fragments. The term
"operably linked" refers to a juxtaposition wherein
the components described are in a relationship
permitting them to function in their intended manner.
Such vectors may be transformed into a suitable host
cell to provide for expression of a polypeptide
according to the invention. Thus, in a further
aspect, the invention provides a process for preparing
polypeptides according to the invention which
comprises cultivating a host cell, transformed or
transfected with an expression vector as described
above under conditions to provide for expression by
the vector of a coding sequence encoding the
polypeptides, and recovering the expressed
polypeptides.

The vectors may be, for example, plasmid, virus
or phage vectors provided with an origin of
replication, optionally a promoter for the expression
of said nucleotide and optionally a regulator of the
promoter. The vectors may contain one or more
selectable markers, such as, for example, ampicillin
resistance.

Polynucleotides according to the invention may be 
inserted into the vectors described in an antisense 
orientation in order to provide for the production of 
antisense RNA. Antisense RNA or other antisense 
nucleic acids may be produced by synthetic means. 
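The antisense orientation described above can be illustrated with a minimal sketch. It assumes the input is the coding (sense) strand written 5' to 3'; the function names are hypothetical and the example sequence is invented for illustration.

```python
# Translation table for Watson-Crick complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """Return the RNA complementary to the mRNA of the given sense-strand DNA."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


if __name__ == "__main__":
    # Invented six-base example, not taken from the Figures.
    print(reverse_complement("ATGGCC"))
    print(antisense_rna("ATGGCC"))
```

Cloning an insert in the antisense orientation means the promoter transcribes the strand that `reverse_complement` returns, so the transcript can pair with the endogenous mRNA.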

In accordance with the present invention, a 
defined nucleic acid includes not only the identical 
nucleic acid but also any minor base variations 
including in particular, substitutions in bases which 
result in a synonymous codon (a different codon 
specifying the same amino acid residue) due to the 
degenerate code in conservative amino acid 
substitutions. The term "nucleic acid sequence" also 
includes the complementary sequence to any single 
stranded sequence given regarding base variations. 

The present invention also comprises within its 
scope proteins or polypeptides expressed by the 
nucleic acid molecules according to the invention or a 
functional equivalent, derivative or bioprecursor 
thereof.

The present invention also advantageously 
provides nucleic acid sequences of at least 
approximately 15 contiguous nucleotides of a nucleic 
acid according to the invention and preferably from 15 
to 50 nucleotides. These sequences may, 
advantageously be used as probes or primers to 
initiate replication, or the like. Such nucleic acid 
sequences may be produced according to techniques well 
known in the art, such as by recombinant or synthetic 
means. They may also be used in diagnostic kits or 
the like for detecting the presence of a nucleic acid 
according to the invention. These tests generally 
comprise contacting the probe with the sample under 
hybridising conditions and detecting for the presence
of any duplex or triplex formation between the probe
and any nucleic acid in the sample.

Advantageously, the nucleic acid sequences
according to the invention may be produced using such
recombinant or synthetic means, such as for example
using PCR cloning mechanisms which generally involve
making a pair of primers, which may be from
approximately 15 to 50 nucleotides, to a region of the
gene which is desired to be cloned, bringing the
primers into contact with mRNA, cDNA, or genomic DNA
from a human cell, performing a polymerase chain
reaction under conditions which bring about
amplification of the desired region, isolating the
amplified region or fragment and recovering the
amplified DNA. Generally, such techniques as defined
herein are well known in the art, such as described in
Sambrook et al. (Molecular Cloning: a Laboratory
Manual, 1989).

The nucleic acids or oligonucleotides according
to the invention may carry a revealing label.
Suitable labels include radioisotopes such as 32P or
35S, enzyme labels or other protein labels such as
biotin or fluorescent markers. Such labels may be
added to the nucleic acids or oligonucleotides of the
invention and may be detected using techniques known
per se.

The polypeptide or protein according to the
invention includes all possible amino acid variants
encoded by the nucleic acid molecule according to the
invention including a polypeptide encoded by said
molecule and having conservative amino acid changes.
Polypeptides according to the invention further
include variants of such sequences, including
naturally occurring allelic variants which are
substantially homologous to said polypeptides. In
this context, substantial homology is regarded as a
sequence which has at least 70%, preferably 80 or 90%
amino acid homology with the polypeptides encoded by
the nucleic acid molecules according to the invention.

Nucleic acids and polypeptides which are 
particularly preferred are those comprising the 
sequences of nucleotides illustrated in Figures 1 and 
2. These sequences are specific to Candida albicans 
with no functionally related sequences in other 
prokaryotic or eukaryotic organisms as yet identified
from the respective genomic databases. 

Nucleotide sequences according to the invention 
are particularly advantageous for selective 
therapeutic targets for treating Candida albicans 
associated infections. For example, an antisense 
nucleic acid capable of binding to the nucleic acid 
sequence illustrated in any of Figures 1 to 61 may be 
used to selectively inhibit expression of the 
corresponding polypeptides, leading to impaired growth 
of the Candida albicans with reductions of associated 
illnesses or diseases. The antisense nucleic acid 
corresponding to the sequences identified in Figures 1 
and 2 may therefore be particularly useful in 
selective treatment of Candida albicans associated 
infection. 

The nucleic acid molecule or the polypeptide 
according to the invention may be used as a 
medicament, or in the preparation of a medicament, for 
treating diseases or conditions associated with 
Candida albicans infection. 

Advantageously, the nucleic acid molecule or the 
polypeptide according to the invention may be provided 
in a pharmaceutical composition together with a 
pharmaceutically acceptable carrier, diluent or
excipient therefor. 



Antibodies to the protein or polypeptide of the 
present invention may, advantageously, be prepared by 
techniques which are known in the art. For example, 
polyclonal antibodies may be prepared by inoculating a 
host animal, such as a mouse, with the polypeptide 
according to the invention or an epitope thereof and 
recovering immune serum. Monoclonal antibodies may be
prepared according to known techniques such as 
described by Kohler R. and Milstein C, Nature (1975) 
256, 495-497. 

Antibodies according to the invention may also be 
used in a method of detecting for the presence of a 
polypeptide according to the invention, which method 
comprises reacting the antibody with a sample and 
identifying any protein bound to said antibody. A kit 
may also be provided for performing said method which 
comprises an antibody according to the invention and 
means for reacting the antibody with said sample. 

Proteins which interact with the polypeptide of 
the invention may be identified by investigating 
protein-protein interactions using the two-hybrid 
vector system first proposed by Chien et al. (1991).

This technique is based on functional 
reconstitution in vivo of a transcription factor which 
activates a reporter gene. More particularly the 
technique comprises providing an appropriate host cell 
with a DNA construct comprising a reporter gene under 
the control of a promoter regulated by a transcription 
factor having a DNA binding domain and an activating 
domain, expressing in the host cell a first hybrid DNA 
sequence encoding a first fusion of a fragment or all 
of a nucleic acid sequence according to the invention 
and either said DNA binding domain or said activating 
domain of the transcription factor, expressing in the 
host at least one second hybrid DNA sequence, such as 



a library or the like, encoding putative binding 
proteins to be investigated together with the DNA 
binding or activating domain of the transcription 
factor which is not incorporated in the first fusion; 
detecting any binding of the proteins to be 
investigated with a protein according to the invention 
by detecting for the presence of any reporter gene 
product in the host cell; optionally isolating second 
hybrid DNA sequences encoding the binding protein. 

An example of such a technique utilises the GAL4 
protein in yeast. GAL4 is a transcriptional activator 
of galactose metabolism in yeast and has a separate 
domain for binding to activators upstream of the 
galactose metabolising genes as well as a protein 
binding domain. Nucleotide vectors may be 
constructed, one of which comprises the nucleotide 
residues encoding the DNA binding domain of GAL4. 
These binding domain residues may be fused to a known 
protein encoding sequence, such as for example the 
nucleic acids according to the invention. The other 
vector comprises the residues encoding the protein 
binding domain of GAL4. These residues are fused to 
residues encoding a test protein. Any interaction 
between polypeptides encoded by the nucleic acid 
according to the invention and the protein to be 
tested leads to transcriptional activation of a 
reporter molecule in a GAL-4 transcription deficient 
yeast cell into which the vectors have been 
transformed. Preferably, a reporter molecule such as 
β-galactosidase is activated upon restoration of
transcription of the yeast galactose metabolism genes. 

Further provided by the present invention is one 
or more Candida albicans cells comprising an induced 
mutation in the DNA sequence encoding the polypeptide 
according to the invention.

A further aspect of the invention provides a
method of identifying compounds which selectively
inhibit expression of polypeptides expressed from the
nucleotide sequences illustrated in any of Figures 1
to 61 and which are critical for growth and survival
of Candida albicans, which method comprises (a)
contacting a compound to be tested with one or more
Candida albicans cells having a mutation in a nucleic
acid molecule according to the invention which
mutation results in overexpression or underexpression
of said polypeptides in addition to one or more wild
type Candida cells, (b) monitoring the growth and/or
activity of said mutated cell compared to said wild
type wherein differential growth or activity of said
one or more mutated Candida cells provides an
indication of selective action of said compound on
said polypeptide or another polypeptide in the same or
a parallel pathway.

Compounds identifiable or identified using the
method according to the invention may advantageously
be used as a medicament, or in the preparation of a
medicament to treat diseases or conditions associated
with Candida albicans infection. These compounds may
also advantageously be included in a pharmaceutical
composition together with a pharmaceutically
acceptable carrier, diluent or excipient therefor.

A further aspect of the invention provides a
method of identifying DNA sequences from a cell or
organism which DNA encodes polypeptides which are
critical for growth or survival, which method
comprises (a) preparing a cDNA or genomic library from
said cell or organism in a suitable expression vector
which vector is such that it can either integrate into
the genome in said cell or that it permits
transcription of antisense RNA from the nucleotide
sequences in said cDNA or genomic library, (b)
selecting transformants exhibiting impaired growth and
determining the nucleotide sequence of the cDNA or
genomic sequence from the library included in the
vector from said transformant. Preferably, the cell
or organism may be any yeast or filamentous fungi,
such as, for example, Saccharomyces cerevisiae,
Schizosaccharomyces pombe or Candida albicans.

A further aspect of the invention provides a 
pharmaceutical composition comprising a compound 
according to the invention together with a 
pharmaceutically acceptable carrier, diluent or 
excipient therefor. 

The present invention may be more clearly 
understood with reference to the accompanying example, 
which is purely exemplary, with reference to the 
accompanying drawings, wherein 



Figures 1 & 2: are nucleotide sequences of
previously unknown function isolated from Candida
albicans and which sequences are not present in the
public domain.

Figures 3 to 35: are nucleotide sequences of
previously unknown function isolated from Candida
albicans and which sequences are partially or fully
present in the public domain.

Figures 36 to 40: are nucleotide sequences isolated
from Candida albicans and which have an identified
function based on sequence homology with proteins
from other organisms and which sequences are not
present in the public domain.

Figures 41 to 61: are nucleotide sequences having an
identified function based on sequence homology
comparisons from other organisms and which sequences
are fully or partially present in the public domain.

Figure 62: is a diagrammatic representation of
plasmid pGAL1PNiST-1.

Figure 63: is a nucleotide sequence of plasmid
pGAL1PNiST-1 of Figure 62.

Figure 64: is a diagrammatic representation of
plasmid pGAL1PSiST-1.

Figure 65: is a nucleotide sequence of plasmid
pGAL1PSiST-1 of Figure 64.

Figures 66 to 106: are amino acid sequences of the
appropriately corresponding DNA sequences illustrated
in Figures 1 to 61.



Example 1

Identification of novel drug targets in C.
albicans by anti-sense and disruptive integration

The principle of the approach is based on the
fact that when a particular C. albicans mRNA is
inhibited by producing the complementary anti-sense
RNA, the corresponding protein will decrease. If this
protein is critical for growth or survival, the cell
producing the anti-sense RNA will grow more slowly or
will die.

Since anti-sense inhibition occurs at mRNA level,
the gene copy number is irrelevant, thus allowing
applications of the strategy even in diploid
organisms.

Anti-sense RNA is endogenously produced from an
integrative or episomal plasmid with an inducible
promoter; induction of the promoter leads to the
production of a RNA encoded by the insert of the
plasmid. This insert will differ from one plasmid to
another in the library. The inserts will be derived
from genomic DNA fragments or from cDNA to cover, to
the extent possible, the entire genome.

The vector is a proprietary vector allowing
integration by homologous recombination at either the
homologous insert or promoter sequence in the Candida
genome. After introducing plasmids from cDNA or
genomic libraries into C. albicans, transformants are
screened for impaired growth after promoter (& thus
anti-sense) induction in the presence of lithium
acetate. Lithium acetate prolongs the G1 phase and
thus allows anti-sense to act during a prolonged
period of time during the cell cycle. Transformants
which show impaired growth in both induced and non-induced
media, thus showing a growth defect due to
integrative disruption, are selected as well.

Transformants showing impaired growth are
presumed to contain plasmids which produce anti-sense
RNA to mRNAs critical for growth or survival. Growth
is monitored by measuring growth-curves over a period
of time in a device (Bioscreen Analyzer, Labsystems)
which allows simultaneous measurement of growth-curves
of 200 transformants.

Subsequently plasmids can be recovered from the
transformants and the sequence of their inserts
determined, thus revealing which mRNA they inhibit. In
order to be able to recover the genomic or cDNA insert
which has integrated into the Candida genome, genomic
DNA is isolated, cut with an enzyme which cuts only
once into the library vector (and estimated approx.
every 4096 bp in the genome) and religated. PCR with
primers flanking the insert will yield (partial)
genomic or cDNA inserts as PCR fragments which can
directly be sequenced. This PCR analysis (on ligation
reaction) will also show us how many integrations
occurred. Alternatively the ligation reaction is
transformed to E. coli and PCR analysis is performed
on colonies or on plasmid DNA derived thereof.
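The figure of approximately 4096 bp corresponds to the expected spacing of a 6 bp recognition site in random sequence with equal base frequencies (4 to the power 6). The sketch below reproduces that arithmetic and shows how the estimate shifts with base composition. The function names and the G+C fraction of 0.33 are illustrative assumptions; GAGCTC is the standard recognition sequence of SacI, one of the enzymes named later in the text.

```python
def site_probability(site, gc_fraction=0.5):
    """Probability that a random position starts the given site.

    Assumes independent bases, with G and C each at gc_fraction/2
    and A and T each at (1 - gc_fraction)/2.
    """
    p_gc = gc_fraction / 2.0
    p_at = (1.0 - gc_fraction) / 2.0
    prob = 1.0
    for base in site.upper():
        prob *= p_gc if base in "GC" else p_at
    return prob


def expected_spacing(site, gc_fraction=0.5):
    """Mean distance in bp between occurrences of the site under the same model."""
    return 1.0 / site_probability(site, gc_fraction)


if __name__ == "__main__":
    print(expected_spacing("GAGCTC"))               # equal base frequencies
    print(round(expected_spacing("GAGCTC", 0.33)))  # assumed AT-rich genome
```

Under the equal-frequency model any 6 bp site gives 4096 bp; in an AT-rich genome a G+C-rich site such as GAGCTC would be expected to occur less often, so the 4096 bp figure is best read as a rough estimate.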

This method is employed for a genome wide search
for novel C. albicans genes which are important for
growth or survival.

Materials & Methods

Construction of pGAL1PNiST-1

The backbone of the pGAL1PNiST-1 vector
(integrative anti-sense SfiI-NotI vector) is
pGEM11Zf(+) (Promega Inc.). First, the CaMAL2
EcoRI/SalI promoter fragment from pDBVSO (D.H. Brown
et al.) was ligated into EcoRI/SalI-opened
pGEM11Zf(+) resulting in the intermediate construct
pGEMMAL2P-1. Into the latter (WscI/CIP) the CaURA3
selection marker was cloned as a Eco47III/XmnI
fragment derived from pRM2. The resulting pGEMMAL2P-2
vector was NotI/HindIII opened in order to accept the
NotI-stuffer-SfiI cassette from pPCKlNiSGYCT-1
(EagI/HindIII fragment): pMAL2PNiST-1. Finally, the
plasmid pGAL1PNiST-1 was constructed by exchanging the
SalI/Ecl136II MAL2 promoter in pMAL2PNiST-1 by the
XhoI/SmaI GAL1 promoter fragment derived from
pRM2GAL1P.

Construction of pGAL1PSiST-1

The vector pGAL1PSiST-1 was created for cloning
the small genomic DNA fragments (flanked by SfiI
sites) behind the GAL1 promoter. The only difference
with pGAL1PNiST-1 is that the hIFNβ (stuffer fragment)
insert fragment in pGAL1PSiST-1 is flanked by two SfiI
sites instead of a SfiI and a NotI site as in
pGAL1PNiST-1. To construct pGAL1PSiST-1 the
EcoRI-HindIII fragment, containing hIFNβ flanked by a SfiI
and a NotI site, of pMAL2pHiET-3 (unpublished) was
exchanged by the EcoRI-HindIII fragment, containing
hIFNβ flanked by two SfiI sites, from YCp50S-S (an E.
coli / S. cerevisiae shuttle vector derived from the
plasmid YCp50, which is deposited in the ATCC
collection (number 37419; Thrash et al., 1985); an
EcoRI-HindIII fragment, containing the gene hIFNβ,
which is flanked by two SfiI sites, was inserted in
YCp50, creating YCp50S-S), resulting in plasmid
pMAL2PSiST-1. The MAL2 promoter from pMAL2PSiST-1 (by
a NaeI-FspI digest) was further replaced by the GAL1
promoter from pGAL1PNiST-1 (via a XhoI-SalI digest),
creating the vector pGAL1PSiST-1.

Candida albicans genomic library

* Preparation of the genomic DNA fragments
A Candida albicans genomic DNA library with small DNA
fragments (400 to 1,000 bp) was prepared. Genomic DNA
of Candida albicans B2630 was isolated following a
modified protocol of Blin and Stafford (1976). The
quality of the isolated genomic DNA was checked by gel
electrophoresis. Undigested DNA was located on the gel
above the marker band of 26,282 bp. A little smear,
caused by fragmentation of the DNA, was present.
To obtain enrichment for genomic DNA fragments of the
desired size, the genomic DNA was partially digested.
Several restriction enzymes (AluI, HaeIII and RsaI;
all creating blunt ends) were tried out. The
appropriate digest conditions were determined by
titration of the enzyme. Enrichment of small DNA
fragments was obtained with 70 units of AluI on 10 µg
of genomic DNA for 20 min. T4 DNA polymerase
(Boehringer) and dNTPs (Boehringer) were added to
polish the DNA ends. After extraction with
phenol-chloroform the digest was size-fractionated on an
agarose gel. The genomic DNA fragments with a length
of 500 to 1,250 bp were eluted from the gel by
centrifugal filtration (Zhu et al., 1985). SfiI
adaptors (5' GTTGGCCTTTT) or (5' AGGCCAAC) were
attached to the DNA ends (blunt) to facilitate cloning
of the fragments into the vector. Therefore, an 8-mer
and 11-mer oligonucleotide (comprising the SfiI site)
were kinased and annealed. After ligation of these
adaptors to the DNA fragments a second
size-fractionation was performed on an agarose gel. The
DNA fragments of 400 to 1150 bp were eluted from the
gel by centrifugal filtration.

* Preparation of the pGAL1PSiST-1 vector fragment
The small genomic DNA fragments were cloned after
the GAL1 promoter in the vector pGAL1PSiST-1.
Qiagen-purified pGAL1PSiST-1 plasmid DNA was digested with
SfiI and the largest vector fragment eluted from the
gel by centrifugal filtration (Zhu et al., 1985).
Ligation with a control DNA fragment, flanked by SfiI
sites, was performed as a control. The ligation mix
was electroporated to MC1061 E. coli cells. Plasmid
DNA of 24 clones was analyzed. In all cases the
control fragment was inserted in the pGAL1PSiST-1
vector fragment.

* Upscaling

All genomic DNA fragments (450 ng) were ligated
into the pGAL1PSiST-1 vector (20 ng). After
electroporation at 2500 V, 40 µF, circa 400,000 clones
were obtained. These clones were pooled into three
groups and stored as glycerol slants. Also
Qiagen-purified DNA was prepared from these clones. A clone
analysis showed an average insert length of 600 bp and
a percentage of 91 for clones with an insert. The size
of the library corresponds to 5 times the diploid
genome. The genomic DNA inserts are sense or
anti-sense orientated in the vector.
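The library figures above can be related by simple arithmetic. The sketch below is an assumption about how such a coverage estimate might be made, not the authors' stated calculation: it multiplies clone number, insert-containing fraction and mean insert length, then derives the genome size that a five-fold coverage would imply. The text does not state a genome size, and the function names are hypothetical.

```python
def cloned_bases(n_clones, insert_fraction, mean_insert_bp):
    """Total bp of insert DNA represented in the library."""
    return n_clones * insert_fraction * mean_insert_bp


def fold_coverage(n_clones, insert_fraction, mean_insert_bp, genome_bp):
    """Library size expressed as multiples of a genome of genome_bp."""
    return cloned_bases(n_clones, insert_fraction, mean_insert_bp) / genome_bp


def implied_genome_bp(n_clones, insert_fraction, mean_insert_bp, coverage):
    """Genome size that would give the stated fold coverage."""
    return cloned_bases(n_clones, insert_fraction, mean_insert_bp) / coverage


if __name__ == "__main__":
    # Values from the text: 400,000 clones, 91% with insert, 600 bp mean insert.
    total = cloned_bases(400_000, 0.91, 600)
    print(round(total))                                       # total insert bp
    print(round(implied_genome_bp(400_000, 0.91, 600, 5.0)))  # bp implied by 5x
```

If the five-fold figure was derived in this way, it corresponds to roughly 218 Mb of cloned insert DNA set against a diploid genome of about 44 Mb; a different assumed genome size changes the coverage proportionally.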

Candida albicans cDNA library

Total RNA was extracted from Candida albicans
B2630 grown on respectively minimal (SD) and rich
(YPD) medium as described by Chirgwin et al. in
Sambrook et al. mRNA was prepared from total RNA
using the Invitrogen Fast Track procedure.

First strand cDNA is synthesised with the
Superscript Reverse Transcriptase (BRL) and with an
oligo dT-NotI primer adapter. After second strand
synthesis, cDNA is polished with Klenow enzyme and
purified over a Sephacryl S-400 spun column.
Phosphorylated SfiI adapters are then ligated to the
cDNA, followed by digestion with the NotI restriction
enzyme. The SfiI/NotI cDNA is then purified and sized
on a Biogel column A150M.

The first fraction contains approximately 38,720
clones by transformation, the second fraction only
1540 clones. Clone analysis:

Fr. I: 22/24 inserts, 16 ≥ 1000 bp, 4 ≥ 2000 bp,
average size: 1500 bp.

Fr. II: 9/12 inserts, 3 > 1000 bp, average size: 960
bp.

cDNA was ligated in a NotI/SfiI opened pGAL1PNiST-1
vector (anti-sense).

Candida transformation

The host strain used for transformation is a C.
albicans ura3 mutant, CAI-4, which contains a deletion
in orotidine-5'-phosphate decarboxylase and was
obtained from William Fonzi, Georgetown University
(Fonzi and Irwin). CAI-4 was transformed with the
above described cDNA library or genomic library using
the Pichia spheroplast module (Invitrogen). Resulting
transformants were plated on minimal medium
supplemented with glucose (SD, 0.67% or 1.34% Yeast
Nitrogen base w/o amino acids + 2% glucose) plates
and incubated for 2-3 days at 30°C.

Screening for mutants

Starter cultures were set up by inoculating each
colony in 1 ml SD medium and incubating overnight at
30°C and 300 rpm. Cell densities were determined using
a Coulter counter (Coulter 21; Coulter electronics
limited). 250,000 cells/ml were inoculated in 1 ml SD
medium and cultures were incubated for 24 hours at
30°C and 300 rpm. Cultures were washed in minimal
medium without glucose (S) and the pellet resuspended
in 650 µl S medium. 8 µl of this culture is used for
inoculating 400 µl cultures in a Honeywell-100 plate
(Bioscreen analyzer; Labsystems). Each transformant
was grown during three days in S medium containing
LiAc; pH 6.0, with 2% glucose/2% maltose or 2%
galactose/2% maltose respectively while shaking every
3 minutes for 20 seconds. Optical densities were
measured every hour during three consecutive days and
growth curves were generated (Bioscreen analyzer;
Labsystems).

Growth curves of transformants grown in
respectively anti-sense non-inducing (glucose/maltose)
and inducing (galactose/maltose) medium are compared
and those transformants showing impaired growth upon
anti-sense induction are selected for further
analysis. Transformants showing impaired growth by
virtue of integration into a critical gene are also
selected.
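The text does not specify how growth curves are compared. One simple possibility, offered purely as an assumed sketch, is to compare the area under the optical density curve in inducing versus non-inducing medium. The function names, the example readings and the 0.7 cut-off are all hypothetical.

```python
def area_under_curve(times_h, od):
    """Trapezoidal area under an optical density versus time curve."""
    return sum(
        (t1 - t0) * (y0 + y1) / 2.0
        for t0, t1, y0, y1 in zip(times_h, times_h[1:], od, od[1:])
    )


def is_growth_impaired(times_h, od_non_induced, od_induced, threshold=0.7):
    """Flag a transformant whose induced growth area falls below
    threshold times its non-induced growth area (threshold is an assumption)."""
    ratio = area_under_curve(times_h, od_induced) / area_under_curve(
        times_h, od_non_induced
    )
    return ratio < threshold


if __name__ == "__main__":
    # Invented hourly readings for one transformant.
    hours = [0, 1, 2, 3]
    glucose_maltose = [0.1, 0.2, 0.4, 0.8]       # non-inducing medium
    galactose_maltose = [0.1, 0.12, 0.15, 0.2]   # inducing medium
    print(is_growth_impaired(hours, glucose_maltose, galactose_maltose))
```

A transformant impaired in both media (the integrative disruption case) would not be caught by this ratio and would need comparison against a control strain instead.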

Isolation of genomic or cDNA inserts

Putatively interesting transformants are grown in
1.5 ml SD overnight and genomic DNA is isolated using
the Nucleon MI Yeast kit (Clontech). Concentration of
genomic DNA is estimated by analyzing a sample on an
agarose gel.

20 ng of genomic DNA is digested for three hours
with an enzyme that cuts uniquely in the library
vector (SacI for the genomic library; PstI for the
cDNA library) and treated with RNAse. Samples are
phenol/chloroform extracted and precipitated using
NaOAc/ethanol.

The resulting pellet is resuspended in 500 µl
ligation mixture (1 x ligation buffer and 4 units of
T4 DNA ligase; both from Boehringer) and incubated
overnight at 16°C.

After denaturation (20 min 65°C), purification
(phenol/chloroform extraction) and precipitation
(NaOAc/ethanol) the pellet is resuspended in 10 µl
MilliQ (Millipore) water.

PCR analysis

Inverse PCR is performed on 1 µl of the
precipitated ligation reaction using library vector
specific primers (oligo23 5' TGC-AGC-TCG-ACC-TCG-ACT-G
3' and oligo25 5' GCG-TGA-ATG-TAA-GCG-TGA-C 3' for the
genomic library; 3pGALNistPCR primer:
5' TGAGCAGCTCGCCGTCGCGC 3' and 5pGALNistPCR primer:
5' GAGTTATACCCTGCAGCTCGAC 3' for the cDNA library; both
from Eurogentec) for 30 cycles each consisting of (a)
1 min at 95°C, (b) 1 min at 57°C, and (c) 3 min at
72°C. In the reaction mixture 2.5 units of Taq
polymerase (Boehringer) with TaqStart antibody
(Clontech) (1:1) were used, and the final
concentrations were 0.2 µM of each primer, 3 mM MgCl2
(Perkin Elmer Cetus) and 200 µM dNTPs (Perkin Elmer
Cetus). PCR was performed in a Robocycler
(Stratagene).

15 

Sequence determination 

Resulting PCR products were purified using a PCR 
purification kit (Qiagen) and were quantified by 
comparison of band intensity on an EtBr-stained agarose 
gel with the intensity of DNA marker bands. The amount 
of PCR product (expressed in ng) used in the 
sequencing reaction is calculated as the length of the 
PCR product in basepairs divided by 10. Sequencing 
reactions were performed using the ABI Prism BigDye 
Terminator Cycle Sequencing Ready Reaction Kit 
according to the instructions of the manufacturer (PE 
Applied Biosystems, Foster City, CA) except for the 
following modifications. 

The total reaction volume was reduced to 15 µl. 
The volumes of the individual reagents were changed 
accordingly. The 6.0 µl of Terminator Ready Reaction 
Mix was replaced by a mixture of 3.0 µl Terminator 
Ready Reaction Mix + 3.0 µl Half Term (GENPAK Limited, 
Brighton, UK). After cycle sequencing, reaction 
mixtures were purified over Sephadex G50 columns 
prepared on Multiscreen HV opaque microtiter plates 
(Millipore, Molsheim, Fr) and were dried in a 
SpeedVac. Reaction products were resuspended in 3 µl 
loading buffer. Following denaturation for 2 min at 
95°C, 1 µl of sample was applied on a 5% Long Ranger 
Gel (36 cm well-to-read) prepared from Singel Packs 
according to the supplier's instructions (FMC 
BioProducts, Rockland, ME). Samples were run for 7 
hours (2X run) on an ABI 377XL DNA sequencer. Data 
collection version 2.0 and Sequence analysis version 
3.0 (for basecalling) software packages are from PE 
Applied Biosystems. Resulting sequence text files 
were copied onto a server for further analysis. 
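The rule given above for the amount of template (length of the PCR product in basepairs divided by 10, expressed in ng) can be written out as a short worked sketch. The conversion to a pipetting volume assumes that the concentration of the purified product has been estimated from the agarose gel; the function names are hypothetical.

```python
def template_ng(product_length_bp):
    """Amount of PCR product (ng) for the sequencing reaction: length / 10."""
    return product_length_bp / 10.0


def template_volume_ul(product_length_bp, concentration_ng_per_ul):
    """Volume of purified product (µl) that contains the required amount.

    The concentration is assumed to come from the gel-based estimate.
    """
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return template_ng(product_length_bp) / concentration_ng_per_ul
```

For example, an 800 bp product would call for 80 ng, which corresponds to 4 µl of a preparation estimated at 20 ng/µl.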

Sequence analysis 

Nucleotide sequences were imported into the 
VectorNTI software package (InforMax Inc, North 
Bethesda, MD, USA), and the vector and insert regions 
of the sequences were identified. Sequence similarity 
searches against public and commercial sequence 
databases were performed with the BLAST software 
package (Altschul et al., 1990) version 1.4. Both the 
original nucleotide sequence and the six-frame 
conceptual translations of the insert region were used 
as query sequences. The public databases used were the 
EMBL nucleotide sequence database (Stoesser et al., 
1998), the SWISS-PROT protein sequence database and 
its supplement TrEMBL (Bairoch and Apweiler, 1998), 
and the ALCES Candida albicans sequence database 
(Stanford University, University of Minnesota). The 
commercial sequence databases used were the LifeSeq® 
human and PathoSeq™ microbial genomic databases 
(Incyte Pharmaceuticals Inc., Palo Alto, CA, USA), and 
the GENESEQ patent sequence database (Derwent, London, 
UK). Three major results were obtained on the basis of 
the sequence similarity searches: function, novelty, 
and specificity. A putative function was deduced on 
the basis of the similarity with sequences with a 
known function, the novelty was based on the absence 
or presence of the sequences in public databases, and 
the specificity was based on the similarity with 
vertebrate homologues. 

Methods 

Blastx of the nucleic acid sequences against the 
appropriate protein databases: Swiss-Prot for clones 
of which the complete sequence is present in the 
public domain, and paorfp (PathoSeq™) for clones of 
which the complete sequence is not present in the 
public domain. 

The protein to which the translated nucleic acid 
sequence corresponds is used as a starting point. 
The differences between this protein and our 
translated nucleic acid sequences are marked with a 
double line and annotated above the protein sequence. 
The following symbols are used: 

a one-letter amino acid code or the ambiguity 
code X is used if our translated nucleic acid sequence 
has another amino acid at a certain position, 

the stop codon sign * is used if our translated 
nucleic acid sequence has a stop codon at a certain 
position, 

the letters fs (frame shift) are used if a frame 
shift occurs in our translated nucleic acid sequence, 
and another reading frame is used, 

the words ambiguity or ambiguities are used if a 
part of our translated nucleic acid sequence is 
present in the proteins, but not visible in the 
alignments of the blast results, 

the phrase missing sequence is used if the 
translated nucleic acid sequence does not comprise 
that part of the protein. 

Blastx: compares the six-frame conceptual 
translation products of a nucleotide query sequence 
(both strands) against a protein sequence database. 
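The six-frame conceptual translation that Blastx starts from can be sketched as follows. This is a minimal illustration, not the BLAST implementation: it uses the standard genetic code by default, writes X for any codon containing an ambiguous base, and labels the frames +1 to +3 and -1 to -3. C. albicans is known to decode CUG as serine rather than leucine (NCBI translation table 12), so an alternative table is included as an option; whether the searches described here took this into account is not stated in the text.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, codons enumerated in TCAG order.
STANDARD_CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
# Alternative yeast nuclear code: CTG (CUG) is read as serine.
ALT_YEAST_CODE = dict(STANDARD_CODE, CTG="S")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement; ambiguous bases such as N are left unchanged."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq, code=STANDARD_CODE):
    """Translate complete codons; unknown codons become the ambiguity code X."""
    seq = seq.upper()
    return "".join(code.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq, code=STANDARD_CODE):
    """Return the conceptual translations of all six reading frames."""
    frames = {}
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:], code)
    return frames
```

Each of the six translations would then be compared against the protein database, and the stop codon sign * and the ambiguity code X in the output correspond to the symbols listed above.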

Screening for compounds modulating expression of 
polypeptides critical for growth and survival of C. 
albicans 

The method proposed is based on observations 
(Sandbaken et al., 1990; Hinnebusch and Liebman 1991; 
Ribogene PCT WO 95/11969, 1995) suggesting that 
underexpression or overexpression of any component of 
a process (e.g. translation) could lead to altered 
sensitivity to an inhibitor of a relevant step in that 
process. Such an inhibitor should be more potent 
against a cell limited by a deficiency in the 
macromolecule catalyzing that step and/or less potent 
against a cell containing an excess of that 
macromolecule, as compared to the wild type (WT) cell. 

Mutant yeast strains, for example, have shown 
that some steps of translation are sensitive to the 
stoichiometry of the macromolecules involved (Sandbaken 
et al.). Such strains are more sensitive to compounds 
which specifically perturb translation (by acting on a 
component that participates in translation) but are 
as sensitive as wild type strains to compounds with 
other mechanisms of action. 

This method thus not only provides a means to 
identify whether a test compound perturbs a certain 
process but also an indication of the site at which it 
exerts its effect. The component which is present in 
altered form or amount in a cell whose growth is 
affected by a test compound is potentially the site of 
action of the test compound. 

The assay to be set up involves measurement of 
growth of an isogenic strain which has been modified 
only in a certain specific allele, relative to a wild 
type (WT) C. albicans strain, in the presence of 
R-compounds. Strains can be ones in which the 
expression of a specific essential protein is impaired 
upon induction of anti-sense or strains which carry 
disruptions in an essential gene. An in silico 
approach to finding novel essential genes in C. 
albicans will be performed. A number of essential 
genes identified in this way will be disrupted (in one 
allele) and the resulting strains can be used for 
comparative growth screening. 

Assay for high throughput screening for drugs 

35 µl minimal medium (S medium + 2% galactose + 
2% maltose) is transferred into a transparent 
flat-bottomed 96 well plate using an automated pipetting 
system (Multidrop, Labsystems). A 96-channel pipettor 
(Hydra, Robbins Scientific) transfers 2.5 µl of 
R-compound at 10'^ M in DMSO from a stock plate into the 
assay plate. 

The selected C. albicans strains (mutant and 
parent (CAI-4) strain) are stored as glycerol stocks 
(15%) at -70°C. The strains are streaked out on 
selective plates (SD medium) and incubated for two 
days at 30°C. For the parent strain, CAI-4, the medium 
is always supplemented with 20 µg/ml uridine. A single 
colony is scooped up and resuspended in 1 ml minimal 
medium (S medium + 2% galactose + 2% maltose). Cells 
are incubated at 30°C for 8 hours while shaking at 
250 rpm. A 10 ml culture is inoculated at 250,000 
cells/ml. Cultures are incubated at 30°C for 24 
hours while shaking at 250 rpm. Cells are counted in 
a Coulter counter and the final culture (S medium + 2% 
galactose + 2% maltose) is inoculated at 20,000 to 
50,000 cells/ml. Cultures are grown at 30°C while 
shaking at 250 rpm until a final OD of 0.24 (+/- 0.04) 
6nM is reached. 
200 µl of this yeast suspension is added to all 
wells of MW96 plates containing R-compounds in a 450 
µl total volume. MW96 plates are incubated (static) at 
30°C for 48 hours. 
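Inoculating a culture at a defined cell density, as in the steps above, comes down to a simple dilution calculation once the cells have been counted. The sketch below assumes that the counted density of the preculture is known in cells/ml; the function name and the example density are hypothetical.

```python
def inoculum_volume_ml(target_cells_per_ml, final_volume_ml, counted_cells_per_ml):
    """Volume of counted preculture (ml) needed to reach the target density.

    Based on C1 * V1 = C2 * V2; the remainder of the final volume is
    made up with fresh medium.
    """
    if counted_cells_per_ml <= 0:
        raise ValueError("counted density must be positive")
    if target_cells_per_ml > counted_cells_per_ml:
        raise ValueError("preculture is less dense than the target")
    return target_cells_per_ml * final_volume_ml / counted_cells_per_ml
```

With a preculture counted at 5 x 10^7 cells/ml, for instance, a 10 ml culture at 250,000 cells/ml would take 0.05 ml of preculture.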

Optical densities are measured after 48 hours. 
Test growth is expressed as a percentage of 
positive control growth for both mutant (x) and wild 
type (y) strains. The ratio (x/y) of these derived 
variables is calculated. 
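The derived variables x and y and their ratio can be computed as in the following sketch. It assumes that each well yields a single optical density reading and that an optional blank reading is subtracted; the blank subtraction and the function names are assumptions that are not part of the text.

```python
def percent_of_control(od_test, od_control, od_blank=0.0):
    """Test growth as a percentage of positive control growth."""
    return 100.0 * (od_test - od_blank) / (od_control - od_blank)


def growth_ratio_x_over_y(mutant_test, mutant_control, wt_test, wt_control,
                          od_blank=0.0):
    """Ratio x/y of mutant (x) to wild type (y) relative growth."""
    x = percent_of_control(mutant_test, mutant_control, od_blank)
    y = percent_of_control(wt_test, wt_control, od_blank)
    if y == 0:
        raise ValueError("wild type growth is zero; ratio undefined")
    return x / y
```

A ratio well below 1 would indicate that the compound inhibits the mutant more strongly than the wild type, which is the differential effect the assay is designed to detect.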
Claims 

1. A nucleic acid molecule encoding a 
polypeptide which is critical for survival and growth 
of the yeast Candida albicans and which nucleic acid 
molecule comprises any of the sequences of nucleotides 
illustrated in Figure 1, 2, 4 to 7, 9 to 11, 13, 15 to 
20, 22 to 26, 28 to 32, 34 to 43, 45a and b, 47 to 49, 
51, 52, 53 to 57, 59 and 60. 

2. A nucleic acid molecule encoding a 
polypeptide which is critical for survival and growth 
of the yeast Candida albicans and which nucleic acid 
molecule comprises any of the sequences of nucleotides 
illustrated in Figure 1, 2, 36, 37a, 38, 39 and 40 and 
fragments or derivatives of said nucleic acid 
molecules. 

3. A nucleic acid molecule according to claim 1 
or 2 which is mRNA. 

4. A nucleic acid molecule according to claim 1 
or 2 which is DNA. 

5. A nucleic acid molecule according to claim 4 
which is cDNA. 

6. A nucleic acid molecule capable of 
hybridising to the molecules according to any of 
claims 1 to 5 or the sequences illustrated in any of 
Figures 1 to 61 under high stringency conditions. 

7. A polypeptide encoded by the nucleic acid 
molecule according to any of claims 1 to 6 or the 
sequence illustrated in any of Figures 1 to 61. 



8. An expression vector comprising a nucleic 
acid molecule according to claim 4 or 5. 

9. An expression vector according to claim 8 
which comprises an inducible promoter. 

10. An expression vector according to claim 8 or 
9 which comprises a sequence encoding a reporter 
molecule. 

11. A nucleic acid molecule according to any of 
claims 1 to 6 or the nucleotide sequences illustrated 
in Figure 1 to 61 for use as a medicament. 

12. Use of a nucleic acid molecule according to 
any of claims 1 to 5 or the sequences illustrated in 
Figure 1 to 61 in the preparation of a medicament for 
treating Candida albicans associated diseases. 

13. A polypeptide according to claim 7 for use 
as a medicament. 

14. Use of a polypeptide according to claim 7 in 
the preparation of a medicament for treating Candida 
albicans associated infections. 

15. A pharmaceutical composition comprising a 
nucleic acid molecule according to any of claims 1 to 
6 or a polypeptide according to claim 7 together with 
a pharmaceutically acceptable carrier, diluent or 
excipient therefor. 

16. A Candida albicans cell comprising an 
induced mutation in the DNA sequence encoding the 
polypeptide according to claim 7. 



17. A method of identifying compounds which 
selectively modulate expression of polypeptides which 
are crucial for growth and survival of Candida 
albicans, which method comprises: 

(a) contacting a compound to be tested with one 
or more Candida albicans cells having a 
mutation in a nucleic acid molecule 
according to any of claims 1 to 5 which 
mutation results in overexpression or 
underexpression of said polypeptides in 
addition to contacting one or more wild type 
Candida albicans cells with said compound, 

(b) monitoring the growth and/or activity of 
said mutated cell compared to said wild 
type; wherein differential growth or 
activity of said one or more mutated Candida 
cells is indicative of selective action of 
said compound on a polypeptide or another 
polypeptide in the same or a parallel 
pathway. 

18. A compound identifiable according to the 
method of claim 17. 

19. A compound according to claim 18 for use as 
a medicament. 

20. Use of a compound according to claim 18 in 
the preparation of a medicament for treating Candida 
albicans associated diseases. 

21. A pharmaceutical composition comprising a 
compound according to claim 18 together with a 
pharmaceutically acceptable carrier, diluent or 
excipient therefor. 

22. A method of identifying DNA sequences from a 
cell or organism which DNA encodes polypeptides which 
are critical for growth or survival of said cell or 
organism, which method comprises: 

(a) preparing a cDNA or genomic library from 
said cell or organism in a suitable 
expression vector which vector is such that 
it can either integrate into the genome in 
said cell or that it permits transcription 
of antisense RNA from the nucleotide 
sequences in said cDNA or genomic library, 

(b) selecting transformants exhibiting impaired 
growth and determining the nucleotide 
sequence of the cDNA or genomic sequence 
from the library included in the vector from 
said transformant. 

23. A method according to claim 22 wherein said 
cell or organism is a yeast or filamentous fungus. 

24. A method according to claim 22 or 23 wherein 
said cell or organism is any of Saccharomyces 
cerevisiae, Schizosaccharomyces pombe or Candida albicans. 

25. Plasmid pGAL1PSiST-1 having the sequence of 
nucleotides illustrated in Figure 63. 

26. Plasmid pGAL1PNiST-1 having the sequence of 
nucleotides illustrated in Figure 65. 

27. An antibody capable of binding to a 
polypeptide according to claim 7. 

28. An oligonucleotide comprising a fragment of 
from 15 to 50 contiguous nucleic acid sequences of a 
nucleic acid molecule according to any of claims 1 to 
6. 



Sequences with unknown function, C. albicans sequence NOT present in the public domain 
(ALCES/EMBL) 

>328c2 1803bp in-house: 1123-1803 public: 1-436/468-1021 PathoSeq: 
437-467/1022-1122 



ATGTCTATTACAGTrACAnTCCGAAATCTCCATCTACGAAAAAACGTGCACCG 
GCAnTGGAATrGAGTTGGAGTTYAG 

TCAMCAAGSCAGTAGCGATGGTGCTATAGAGAAAAGCGGCATTGGCAGTTCCT 
GTGTTTAGCGTTGACAACCAAGACTWT 

GTATTKATAAGAGAYCWTGCCAAGTACTGGGGCTACCCTTCATCGTATCAATT 
GATTGTCAAGTTGGTCAAATGTGCTAA 

CATrGAAAAGTCGCAAATCTTAAAGACCGATAAGGATITGAATAGAGAGrrGT 
TTGAGTTGGAnTGATrGAAGAAGCAG 

ATACAAAGATT GATCTT TmATATTTCGTrACCCTrGGTCTATTCAAGAATAGA 
AAATAAGAAGGnTTTTATGTTCTG 

CGTGAACCAGAACAGCCAAAGGTGTCGAAAGCMCCAACACAAGAGAAACCAG 
CAAGTGTGGITGCTGCAGAAGAAGATGA 

CGATAATCTAGATGATGATGAGGAGGACGAAGTGGATGAAGACATGGATGAA 
GATAATGATAATAGTGGGGAATTGTCTA 

AAGGATACAAGCACATGCACAAGGACCATCCAAAGTATATAAATGACGATAG 
GGTTACTATTGGACAAGTGTTTCATCAA 

TACGGACTTGACCCrrCGACACCAITAACCCATrCACmTCAATAGTATCAAC 
TCA ATGT CGAAGCTAAACTATTACAA 

GAATmGGAGTrrCAGGTrACCGATTrcrrCCCAACAGCAAGTTATCTrATGC 
AGAACGAGAATTGGTGTTGAATGCCA 

ACAACTACAATGATATGCACATTAACGAAAAGACAGAATCCAAGCCGAAAAA 
GAGTTTCCGTAAACCCATTGGAAAGTCA 

AAGAAACATAACTrGCAGATrGATCCGAACTCCATAGATTTAAGCGAGTCAGT 
GATTCCGGGACAAGGGnTATACCTGA 

CnTAGTATCCACCTATCTTTGCAAAGTCCCTAATTATTATGTGACATCAACCC 
ACCAAAGTCTCCCGCTGTCGTTCAAC 

ACAAAGAATCTTAATGCAACTTCGAACTCTTCGTA'nTGTTTAATGATAATGTC 
AAGATAAAGTCAAAAAGTATTCAGAA 

GTWSGTGTTCAACAGCGATACCGATAATTACCATCACACAAAGTATTTCTACA 
CCAAAACCTACCGTGGTCCAGGGTCGG 

GGAATTACAAGGATGOTGCATTGATGAACAAAATCAACAAGATACATCTTTCC 
AGTAATAAAAAGCCGCGCCACAAGAGA 

AAGGTGTCGAACAATAACAGGTACAACAAGAGTTTAAAGGGGTTAGTCCACG 
AAAAGriTGACAAGAACnTGTTGAGTA 

CTTGCTTTCTGAGCAACGCAAGTATACCGAGGACTATTCCAATCTTGAAArnr 
ACAC AATAGCTTACAGTTTAATGTTC 

nTTGAATACGTATCGTGGTGTTGCCCAAGAGACATGGAATAACTACTACAAG 
TTTAAATTGATTGATTTCGAACAATTG 

AAGGCTTTGCAAATGGAGGCAAATGAGCTTGAGGAGAGAAAATTGGATGCTG 
CTAGACACCAACAGTGGGCGGAAGAAGA 

GAAGCTTrNCCAAGAAAGATTGCGTTTAGTATTTGAAGATGAACGGACGAGTr 
TGAGCAATTGCAAAGCGAGTTTGGTCA 



GAGAAAGAAGGATTTGGAAGAGAAATrGCGTCGCCGTCAGCTANANGCATCTT 
TGANTGATAGTTTTGAACTTGATAGCG 

AAAATGACNATGAATCTTGACTTGNCCAAANTNAACAAGACTT 

>214c_cpLl 290bp in-house: 1-290 

GAATCNCANACTCGNCACNGCTCCCCAAAAAGGCCAACGTTCGTGCAAAAGGC 
TATACTGGTGATATCCACGCAGATGAA 

GAGCAAGTTTAATCAACTCTTTGTCAATrAATGCTGTACTTGTTTTCATTITATT 
TGCTGGCATITAAAGAATACCCATA 

GTTCAGAAAATAAAATTGAAAAATTTAAAAAAAAACGCAATATCATTCATTTT 
TTTTGlll I'll IGACAATAATATTAAT 

ATGTAGTTACCAATG 1 111" 1 AGATnTATATGrnTGAAAAAATAGTTTG 



Sequences with unknown function, C. albicans sequence present in the public domain 
(ALCES) 

>1 13g2 638bp in-house: 1-638 

CTTAmGGTTCTAGTGTCTCAATTGGTTATCCATTAACATCTATTCCCAACTCC 
ATCATTATTGGCAATAAATAAATGG 

GTGTTATATCTATTGGTAATAACTAAACTGGTGTCAATTCAATTCCAATATGGT 
CATGACAATTGAAAGTGITACTGTTC 

TGGTTTACATATTCTACAGGTTACAACTATTGATTGGTTAGAAGTTTGGTTTCA 
ACATCACCTGTTGCTAAGAATAAATG 

TTGGTCATATCAATTGAATCAnTGITGGTGTrATGGTAAGTAAATGCTGGTTA 
TATCTATTATCTACAACCACCAAGTG 

ATAAATGCTGAACCGTAGTCACCAACTGTTATGCTGGTTGTATCTATTGACTAA 
AACTACCCTAGGGATAAATGCTGAAC 

CGTGGTTACCAACTGTTATGCTGGTTGTATCTATTAACTGCAACCACCAAATGA 
TAAATGCTGAACCATAATTACCAACT 

GTTACATTGCTGGTACTACATTAAGAATAAATGCTGCATCTACAAGTACCACCT 
GTTGTGTTAATAAATGCTGCACCTGC 

TAGTACAACTGTTGCTGGTCATGATAGTTACTACACATTACACACCAGACAGTG 
GCAAACAAGGTTATGTAGAAACCA 3 

>11394 844bp in-house 1-844 

*IiS**^''^'^*TmCAACTATCTCACTCCCAATT(n(aCTOAATAWTAATiATACCT^ 

ATCnAACGTAATCraCKAAACKACAATCAATGTATAAAAGCATAAAGATAAAATCTTGCrrcAGGmiAGTTCATAAT 

TATAATGAACAiCAAnACTAAAAGGGATGCTATaAaAATTATAGGCTAGCTACUACCATAOTGGCTGTTCGGGAGTT 

CGQOTAGTTTXXGAAGGTTGGGAAGGTTGGATAGTTTGAGAAGGnCCGTGGCTGATrCTAAATTAACAGAGAAa^ ^ U 

AATGTACAAAAAAACATTCAGiATmAAACAACCrTTATATATATATATrAAAT(XTCTT^^ l<i " 

T(nTGATGATGCmCCTGTTiAAT4TA(XmAAGAACCAGAmACTATCTCAiCTAATATOA(XCTTATACTr^ J 

GmTGAaTrCCATAATCACACAAAAGATTGTGAAATATrmAGCaCAAGGGGATTaACICAnCCATCTCAAAa 

CACATTCTTTGTATCiCCAATACCTTTTGCTAACAGAGGAACAAAiAATTGACACGCATGTCATnACCaATAGCACTA 

jaaACAAATCAAAGGAmiCAATAGTGGGAATGTCAAATCATGTATAnATTUCACAmCACATAmArrrrCA 

GGTACATAATACTCAATATCTAAAACnCAAAATGGTACTGTACCTTAAACmCTC^ 

ACTTGCTAATGTCAAiAATCATGTCrrCACACATTCCAGGTTGT 



>117c_af 623bp in-house: 1-623 



AACTGTCCTGTGAAGACGAACATCACAACCACAATCATGGTCATAACCAAAAT 
CACAATCATGTTGCTCCTATTCCTACA 

ACAGCTGGACAATCATTAAATAATAAAAITGATACATCTAAAGTGACAGCTCT 
CAACATGGCCAACTCTGCTGACGATCT 

AGCAAAAGnTTCAAAGATTCGACTAAAAAATATCAAATCAAACCAATTATCA 
AATCAGACAGTGATGAACAAATGATTA 

TCAACATrCCAnTCTTAATGGTAGTGTCAAATTGTAITCGATAATrCTACGTAC 
CAATGGGGAnTGTATTGTCCCAAA 

ACAATAAAATTATTCAAAAATGACACATCAATTGATnTGATAATGTGGATrCG 
AAGAAACCAATACAGGTGTTAACTCA 

TCCrCAAGTTGGTGTTGCTAATAATGATAGCGATGATCTTCCAGAGTTnTGGA 
ATCAAATAACGATGACGAnTTGTCG 

AACATTATGTGTCTCGACATAAATTCACTGGGGTAAATCAATrGACAATATTTA 
TTGAAGATATTTATGATGAANGAGAA 

GAAGAGTGTCAnTACATTCAATTGAATTGAGAAGGGGAATTCACTGAATTAA 
ACAAAGACCC 5" 



>lSel 977bp in-house: 1-977 bp 

m il [ mCAATATAGTTAGATCTCTrTTTAAAiTTTGAACACAAAUAACAAAAAGTAAAaACTATCACCACCACCA 

CCACCACCAAAACATCATAGTGGAACTTUrrGAAGAATATATTAATAACCATTAATTATAATAACATACTCAAAAGGAA 

TAGGAGTAAAACCTTTATATGTAAAmATTAAArACKrAAAAAAAAAAAAAAAAAAGGAAAGATTTmCAAATCTTGTA 

ATTAAATTAAATTTCATTTCATTTCTTGUAGTGATATAGTCGTAATAGCAGTAATAnAGCAATAATATTAAATAAAAC 

TTTAAAAATAACAATTiTUTAATAGTAATAATAAACGAATTTAACAAAACAAAAAAAGGGSGGGGAAGACAACGAATAT 

AGAAGAAGAAAAAACAACAAGACGGGTAGTAGATATATaGGCTTAAAAAAGaTATCTAAAGTACAGCAAACAaTAAT 

GCAGCAAGACAACCCAmAACAAGAATCATTACCTCCAGAACGTGGTTGTTGTTGTACATACATAGGTT G TTGn G nG 

nCTTGTTGATAATATCCACCACCACCACCTGGCTGTTGTTGATAATATCCACCTTGTGGAGGTGGTGGTCCATaGCAT 

TATATCCnGnGnGTTGATAGTGGCCATGACCkCCACCACCACCAaAAACATCCaCGATCnGTGTTTGTTGAGAA 

TAATTGGGTrGTGATTGTGGTACATAACTTTGTTGTGGTTGTTXSTGATTGGGGTTGATTATTATAATTTGGTGGTGGACC 

ACTAGGmACCGAAATArrCCTCTTmACATTCTTTATAmATA<XTGGTGTAAmi37rcTGGTGTGOT 

GATTGAGTATATAGAAGTTGGAAAATTTAATAACAATTAATCTAAACTTGATATAAGATGGATTAGCAATGATAATGAAG 

AAGTAAAGTTGUTGTG 






1 CXISYVPQSQP NYSQQTCDRG MFSGGGGGHG H7QQCX1GYHA YGPPPPQGGY 
51 7QQQPGGGGG YYQQQQCQQP MYVQQQPSSG GMDSCIKGCL AUCVCCTLO 
101 MIF 



>17gl 731bp in-house: 1-604 public:605-731 



GCTGTAGrnTGCTTCCAAAAGTTrGATCTCGTCATCAACATCATTAACTTCATC 
TAATAAGGTGAATAATTTGGMTCCM 

KGYTCCACGTGSYYGYATCACTirrWATAGA'nTCACTrYGGACAAWAC'nTA 
TTTCTYYGYYGATCCCATTTCYKGAMA 

GATCCGTGTAATGmCKGCGYNUGACATGTCTTTATTATATMGTTCATTTAAA 
GAATAGTGACTCTCTGACAACTGATC 

AAAGGTCKGTARAATCCTACTrCGTAATTGATATATATGATTATTACCACTCTG 
TAGAAACTTGCCAKATITGACTGAAT 

CTTCGTATAACTCTKGTGTGAGCRAKTITACTCTGTrAGATAAATACTCGATTG 
GTGAKKGTGAAKTGTTGTCMTTrGAC 

TGGTACRKGGCTGCRGGRARAKKGATRGATTTKATCATCMAACTGTCCATGGT 
ATTRKRTAACAGTKCACTTYCTTTGAT 

AGAATCAATrAAAGTTGTGGTAGTCACTAGATrGGGnTATGATTGTCGAGTAA 
GTGTGATAGTTGCTCATCACTTATAT 

GCTTATCCAGAAATTTATTGTACAGCACCATGTGAGTCrnTGTAGCTGGTTTA 
TATACTTTATTTAANATGAACTCITC 

GGGATCGAGTTCATCTTCATCTTCGGAGGTGGAAGCGGGGATAGAATGTAAAC 
GTTTGATAGGGGTGTCTTCTTCTTGAT 

AAGGTCCCAGA 



>20794 7b9bp in-house l-TSS 

GCAAGATCTAAACTCaGmTnarrGTAATGmCACiUGCAAiCAiAATATAiUTCGAWiUGCCCCJkAATAATTCT 

CTTCTACAAAmCGAAAAATGmCACATGTATGAAAAAGCTTTATCTATACTATTTCTCaCCAAaaAGCAGTGAG 

AATGATACTGATATCTCCTAT7AGGATACAGTTATCTATTATAAGTATAATAATAATCATGGAGATAAATATATATTWU 

TCGATGGAGTTAACGAGAAAAACAATACAACCCATnTGCAGCAAAATGAGACATTTCACAGAAAAAAAAACAAGAAUG 

ACAATTACTCCATTCAAATAATTCCACAATAAAAAAATAACAAAGAACAAACGTAaAACAUAAACATCACTUTTTCA ft 

CmGAAAATaTrACATACTCAACTTCTAAAGArrAATAATAAGCGATGCATATTaTCAGAATTTAGTGTATAaATA v 

TGaGGTGATriTGAGCCAGGTGAAACAATTCTTrACTAAAAATCTAGGAGnGTrrATATACAGTATmTGTaAAAC 

CTCTCTCTAAOTATACAAGATAAGATrTGTAATCGGnAGAATAACAAGAAGGTGTGGnGTGGAaTGGTGGTGGTGG 

CAAATnGAATGATAT4nGnTATCTCAA<?rATAGCUATACAAGGGCAAAAGGCTGCAAaAAAaAGAACTTGGATT 

GTCGCAArTCTCTTCACCCTTTCAGAATGTCCTCGTGTATGTGATCAAT 



>222g8 543bp in-house: 143-543 public: 1-142 

CTACAAAATGAACAATAGACAAGTTCCTAGTAGAACACCACTAGACACCAACA 
CTAATCCATGTACnTCTCATTGGTAG 

TACCACTTTCAAATATCCAATCAAACACATCAAATTTGATTGTACTGTrCTTAT 
TTTTGAATGAGCTCAGNTGNNTTCTT 

CTGCTACCATAAAAGAATTTGGGAATTTCAAATATTGTACTTTCAAAAGNNGAT 
AATGCACTAGTAGAAGTGCCAAATAC 



TGTTGTAGTAGGmCTTTGTCmGGTGTATTAACAGATATTACnTATrGTCT 
TCAATATTGATTTGTTCAAAAGGAT 

CCTCAAAATTTGGCGTCTrCGTGCCTAATTCTGAATTGGATAATAGTATTTGAG 
CCGTAATCTCATCACTATCGNCTTCT 

TTACCATCATCTTCGTACCTATAATAGTAATAATAATNGTCAGACAATTCAGIT 
TCACCACTATNAATTTCAGAATCCGT 

TNCGTCAACAAGATnTTTAAAATAAAATrGTCAAACATTGACGNTGCAGTAG 
NGGTTGGAAA 

>222g9 804bp in-house: 1-575 public: 576-704 PathoSeq: 705-804 

TTCCAGAGGCAACAAGCGGAAGAAGCACAACGAAAGAAGGAATITGAACAAA 
AGGCCGAATTTATCATTNGCATCATTAC 

TTGAAATGCGCCGAAGAGAAATAGAGAGGCGGAAACAGCAAAAGGAAAGGG 
AACAAAGACAAAAGGAGCACGAAGCAAAO 

AGGGATATCAGGATACAACAACTTTCAGAGCAGGAITCACGGAGTAATCAAAC 
TAAAGAAGAAGAGGAAGTGTTCAAGAA 

GGCCCGGTCTACTAATrCGGGAGCAGACGAGACTGGTITGATGTCAGATAAAG 
AGnTGATGATTCTGCATATTCACCCG 

ATTAnTGnTGAAGAGAAnTGTGGAATAAACCAAATCATCCAGATACAAATC 
ATAAAACCAAAAAATATACTGAGAAT 

GTGGTTGAAAATCTAGATTCTCCACCAAATGATACATCTGCGTACAATrCAAGT 
TTTCATGATGAAACTAATATTCAAAA 

TGAGATCCAAATACCAGAAAATGACGAGTATGTACCACAGATGAAAGCTACAT 
CCAGTGTCAATAATACCACCATCCCTG 

CACAAAGAAGACATGAGTCACnTCCACTTCTGAAAACAAAAGAAGGAAAnr 
GAAACAGCCGACGTTGGGGTTGATGGG 

TTAGATTCTCCAGTGCGGGCACAACCAGAAATATCTGCAAAATCCAAGTCTCC 
GATAATCCCTGGATGGTATAC irn ' lT 

ATGGNACCNAAGAGACTGGAAACTCCTGAAGGCAAATTGCTGTGCAGGGACC 
AATAGGTACATTATATTCCCTCANGGGG 
GNCC 

>226c_a(l 766bp in-house l-7D£bp 

AACGTAAmnATATmACCiiGCrrAACA<»X;A(XTCmATCAmcrrTGTCAAnCAATrACTCCAGAi 

AACACAAGACncmarcnCKTATTiAAAGATAJiTATATAATCACWATAAAAGAATTTTmGGnUAGAAAAm 

aCCKACOTrAAATaTTCTrCnCCCTATAAACCAAAAATCnATATCTCCCAAGTTAAOTAmGAATTCCAAGATT 

ATTTACTTTACAGTGAATCATTAAA-AnTTAATOAAAGCGAGnTAGCTCUTGTCnCAGACAaACTGCTTTTCAG 

GCACCACCAACAAAAGCACCAGAAGCCTCCATGGATCTGGGTACAATTCCCAAlAGATCTCaGCAAGATTGTTTCAAAG 

GTGGATATCATCATCATCATCAAAAGATiAGCCAGTATATGCAGAAAAAGCCCTTCTCAAGAAGCAAAACATAGCACCGG 

AACCAATAAAAATAACTAAACAACAAGTACCAGCTAAACAAATAGGTAaTCTGAACCATCGTCGCCTCTAAGTGTGGCT 

TCGAGTCATGATAATTaTGTTCCGAnCAAGTGCAGCTTCTATATTTTCTGATrCTAAAAATAACAATAGTATGCAAAT 

GTTACTCACAGATGATATAGAGGACATATTAGAGGACATAGACGATGCTGAGATATACGATQCTGAGAAGGTTACCATAA 

CATATATAAGTTCTAAATCATGCrAATACACATTATTAATTATTrG 



>233c cpl full SOObp la-aouse. 1-SOO bp 

GAAAAlTCmCAACAACUCAJkCA0TAAO:CAAGTGATAGTACCAAATCTACnTrAGCAA4TGATGAAACAAGU 

ACTTGATCCTAAAGGCGrrGGAAGCACTACAACAGGTGATAAAGACACAGmCATCAGACAAAGCATCTCTGCCAATTG 

UGATAAAGAAAGTTCACCATCCaAGCTGGAAGTTCAACATCAACACaAGTGGAACTGATAAAAAAACATCTCCTAAA 

UATTAGnACCAATGCTGTCAAtAAAGTTGAAAATAATGATGATTTCAAAAUTTCATTAATGAGGCTGAAAAGGAAGC 

TAAAAAATCCAAATCTGGATTGAAAAAATTAmAACAAGAAGTAGAAGnGmAAAnGmCGATATAAAT^ 

AATTCCAGrmmATTmATmAnTTATmAGTTTAGTnTGAmaTTm 

CAATATAATGTTTAmTTT 



>22g3 (5') 535bp in-house: 1-535 

AGGTTCCAGTTACCAATTTAGGAAGTGTGTTGCAAGCAGGGCTACCAAATATG 
GGTGGCAACACATATGGTAGTAAGTGC 

TACCAATGTGGGTGCAAAAAATnTGCCAAGTAATTTGTATGGCAATAACAGA 
AGTGTTGGCGGATTCNAACTGAGGAAT 

CTTTGGTGTGTAAAAAAAAAGCAATAGCGACTACGCTACAANAGGCAATCNAT 
TATTATTATAAAGTGGAAGTTATATAT 

ATmTCrCGGGGGGGGGGGGGG>mNGGlWTCCCCCCCCCCCCCCCCAlWTrr 
TNTCGGCCCNCCCACCNTNCGGCCTTC 

TGGCTCCCCCCCNNCGGGCCNCNNGTAAATNCCTCCACCCNGGGANAANGGNA 
AANGGGGAACNANNAAGGGGGGACNNN 

NCACCCNATGGGAGGGAAAATCCCNAANNTTTNCCCCCCCNNCCCNGCCNAAN 
CCNCNTGGGGNGGGCCAAANNCNGGGG 

GCTNCNCNCccTOCccccccGCCNT^^s^ccc^lN^^^caw 

GC 
>22g3 (3') 426bp in-house: 1-426 

CCCCCATATAACGTTGTCA ATAGCA ATACTCTGTCGCACCCATAGTGTGCACTT 
CTCGGTGGTATAAAAAAAATmTTC 

TCCCAAAAAAAATCTTCTCCmCCACCACTTrmCTrCrrCTTCCTTCCCCATT 
CCCTCCCAAATCCCTCATTTTCCC 

CAmCCCCTACCCTCCTGGCCCTGTATrCCAAAATTTTCTCGGGGNTACGCCC 
CGAAGANAACCTCCCTCCCACCCACC 

CATCTrrGTCNGGNrrCGACCTTCGGCCTCANGGCTCCACCGTCGGGGNTCTTG 
TATATTTGTAGACTCCNGGAAAAAGO 

GAAAAGGGGAGGAAGAAGGGGGGAAAAAAAAAAAANGGAGGGNGAATCCTT 
TTTNTnTNCCCCCCNTCTCAAACCNAAA 
CCCCNTNTGGGNGGTCNAATTAGGGG 



>24gG 522bp in-house: 1-522 



GAGCMGC>n-GGCCAAAANNCCCACCTrGTnTGAATAGG'nTGCGTrGTAYA 
GGCAG TYAAATGTGTTmTKGGCTTG 

ATnTGAGAAAAAGTTGACTGAAAAAACATGCAAGAAACGGGGTGATCATGA 
AAATAGTCACACACAAAWWGTCAAAAGA 

CNATCACMGAMGACTCAGAATGAGCAGAAGGACTTGTCTGAATTGAGTnTCA 
ATTGTTA TTTAG AGTnTAAATTAGAG 

TTGTAAATTTTTGGTCAGATTTrACGAAAAAACTGGCTGAAAAAAAAACGAGT 
CAGGGTGATTGCMAAAGAACAGAAACA 

ATMSATAATCTTAAATrAAGGTAGTAAAGGCTCTGTGAAGTAAnTAGAGnTA 
AACASGGGGGGGCGAGTCAKKKTTAG 

AG^TGTGAGTCT^^GTITA^ITGGCtAGTGAAT^GACTGGCMMGAT^G^TAA 
ACGTKGGGTAGRAAAAGACACCCTCMC 

SCGTYRMCTNTTGGCCAATTCAACNCGTCCCNGGANACCGCC 
>28gK 475bp in-house: 1-475 

CCCCGTTAACCACTTCTAGGTATACCATrrCATCTGACTGAATAACTGGITAGT 
CGAnTGTTGTTGAAGAAAAGTGACC 

ACCTAGl 11 111 CTGCCAACATmTTGCGATGAGCCGTCGACGCGTTGTCTTTT 
TCTACCCCACGTTTAACAATCTTGC 

CAGTCAATTCCCTAGCCAAATAAACnTAGACTCACAACTCTAACACTGACTCG 
TGCCCCCCTGTTTANNCTCTAAATTA 

CTTCACAGAGCCTTTACTACCrrAATTrAAGATTATCTATrGTGTCTGTITG'nT 
GCAANCACCCTGACTCGNNNGGNNT 

TTCAGCCAGTTITITCGTTNAATCTGACCAAAAA'nTANGNCNCCGA'nTAAAA 
CTCTAAATAACNATTTAANGNCGGNT 

CAAANAGTCCTTCTGCGCGGNNGANTCGTCNCTATTGTCTTGNGANATGATGT 
GTGTGNCTATNTTCAAGAACAT 

>328cl 681 bp in-house: 1-681 

AACCTAAATATGCCCGAnTAAACAAGTTGATGTATTCACCAATGTCAAATATT 
TGGGTAATCCAGTTGCCGTTAnTAT 

GATAGTGATAATTTAACCACTCAAGAAATGCAAAAAATTGCTCGATGOACAAA 
TTTATCAGAAACAACATTTATATTGAC 

TCCAAAATCATCAATTGCTGATTATAGTATTAGAATnTCACTTCTGGTGGGAA 
TGAATTACCATTTGCTGGTCATCCTA 

CnTAGGTACTGCATITGCATTATTGGAAGATGGTAAAATAAAACCAAATGAC 
AATGGACAAATAATTCAAGAATGTGGT 

GCTGGA1 TAGTG AAAATATCCGTTGAAAAAACACCTAATAATAATAGTAATGA 
GTTGCCGTTnTGTTATCTTTTGAATT 

ACCATATTTCAAATTTCATGAAATTGATGACAAAGTAATCGAGGAATrACAAC 
ATTCATGGAATGGAACCAATATTATTG 

GTAAACCCGGTACTTATTGATGCTGGTCCAAAATGGGCAGTnTCCAACTTGGC 
TCCGGTAAAGAAGTATTAGACTTGAA 



TGGTGATTTAGCACAAAATGAGAGATTAAGTTTAGAAAATGGNTGGACAGGNA 
TTGGGNCnTGGNAACATTATGAAAAT 
GGGGATCGGTCCAATTGAGAAATATTGCTCCTGCTGGTGGA 

>33gK_part1 1171bp in-house: 1-588 public: 805-1171 PathoSeq: 589-804 

TCTAAAATCTTCAGTNCCATNCAGAGTTTTAGGGAATGTrACAGACTCAACTCC 
TTTTGCCATGGGGACATTAGGTTCAA 

CATrrrATGCTGTCACrrCTGTTGKCWGATCnTCCAAATYTAKGASKKGGYW 
ACATTACATTTATTGnTGTTrCCCAA 

ACTCAAACTCCTTCAAGGAATWACATGTTTGGCTGCACACCATCAMnTGTCT 
ATGCATCTKATGGTGATCGTATTGGTA 

TnTTAGACGTGGTAGATTAGAGCATGAATTGGTTrGTGAAGGGAACTCTACAG 
TYAACCAATTATTAGTATTTGGAGAA 

TACCTTATrGCTACCACATrAGAAGGTGATATTTTCGTATTTAGAAAAACTGAA 
GGAAAGAAATTCCCMMCTGAATTATA 

CMCTACMATCAGAATAATTAWTYCnTAGTTGAAGGAGAAATTGTGGGATTA 
ATTCMTCCACCTACGTATTTAAWWARWR 

TV^^TTGYTSCWMCYACYSWWTCTGTGTTTGTTATAAATGTGAGAACTGGCAA 

ATTATTATACAAATCCCGGGAATTACAA 

TTCGAAGGCGAAAAGATTTCATCAATCGAAGCTGCTCCAGTTTTGGATGTAATT 
GCTGTTGGTACATCTAATGGAAATGT 

ATTTTTATTCAACATTAAAAAGRGGGAAAGTGTTGGSCCAAAAAATTATTACTT 
CTGGAACTGAATCTTCCTTCGAAAGT 

TGCCTCGATCTCYTrTAGAACAGATGGAGCACCTCATTTGGTTGCTGGTITGAA 
TAACGGAGACTTATATTTCTACGATT 

TAGACAAGAAATCACGTGTTC ATGCTT TGAGAAATGCCCATAAAGAGATTCAT 
GGGGGTOTrGCAAACGCCAGATTTTTG 

AATGGTCAACCAATAGTATTATCAAATGGTGGTGATAATCATTTGAAAGAAnr 
G l ' llll GATCATAATTTAACCAGTTC 

GAATTCATCCATTGTTCCTCCTCCAAGACATCTCAGATATAGAGGTGGGCATTC 

AGCACCACCAGTAGGTATAGAATTTC 

GTCAAGAAGATAAAACCCATTTTTTATTGAGTGCTTGTAGAGATAAAACATTTT 
GGACATTCTACTTTGAGAAAAGATGC 

TCAAGCACAGGAATTGTGTCAAAGATTGCAAAAATCTAAGGATGTATAAAG 
>33gK_part2 1001 bp PathoSeq: 1-1001 

ATATCATAACCGCCCACAAGGATGAAACmTGCGAGAACATGGGATTCAAGA 
AATAAAAGAGTCGGTAGACATTTGTTA 

AACACTATTGATGGTGGCATTGTGAAATCTGTATGTGTGTCTCAGTGTGGTAAT 

TTTGGnTAGTGGGATCATCACTGGG 

TGGTATTGGATCATACAACCTTCAAAGTGGATTGTTGCGTAAAAAATATGTTTT 

ACATAAACAAGCTGTCACCGGTTTAG 

CAATTGATGGAATGAATAGAAAAATGGTrAGTrGTGGnTAGATGGAATTGTG 
GGATTCTATGATnTGGAAAGTCTGTC 

TATTrAGGCAAATTACAACITGAAGCACCTATAACATCCATGATATATCACAA 
ACTGTCTGATCTTGTTGCTTGTGCCTT 



GGATGATTTGTCCATAGTTGTTATTGACGTGACTACTCAAAAAGTCATAAGAAT 
ATTATATGGTCATACCAACAGAATTT 

CAGGAATGGATTTCTCGCCTGATGGGAGATGGATAGTTTCAGTTGCATTGGACT 
CCACnTGCGAACTTGGGACTTGCCA 

ACTG GTGGTTGTATTGATGGGGTGATTTTACCAATTGTGGCAACTGCAGTTAAA 
TTTTCTCCTATTGGTGATATCTTAGC 

GACAACACATGTCTCTGGAAATGGTGTATCCTTATGGACTAATCGTGCCCAG1T 
CAAGCCTGTGTCCACCAGACACGTAG 

AAGAAGATGAGTTTTCAACTATTTTATTACCAAATGCTTCTGGAGATGGCGGTT 
CAACAATGCTAGACGGG 11 ' I'll GGAC 

GAGGATTCTAATGAAGACGGCACTATTGATGAACAGTATACATCTGCTGCTCA 
AATTGATGCATCCTTGATTACTTTATC 

ATCAGAGCCAAGATCAAAATrCAACACnTATrGCATTTGGATACCATTAAAC 
AACAAAGCAAACCGAAAGAAGCACCTA 

AAAAACCAGAAAATGCACCnTCTTTTTACAATTGACTGGA 
>33gK_part3 414bp PathoSeq: 1-414 

AAATTGCGTAAATTGGATACAAACGGTAACCACGCATTTGAAAGTGAATTCAC 
AAAACTATTAAGGGAAGCTGGAGAGAG 

TGGACAAmGAAAGAmTTGACTTACTrACTTAACTTATCTCCTGCTGTATTG 
GACTTGGAAATTAGATCACTTAATT 

CAmGTrCCATTGACTGAAATGACAAATmATTCAAGCTTTAAATGCTGGTTT 
GAAATCAAACGCAAATTATGAAATA 

TGGGAAACTTTATATGCCATGTnTTCAACATACATGGTGATGTTATCCATCAG 
TTTGAAAATGAAACTAGTCTTCATGA 

AGCT TTGGAAGAATACAGACAGTTAAATGATGAAAAGAATAACAAAATGGATT 
CnTAGTGAAATATTGTGCTAGTATCG 

TAAGTnTATTAGT 



>358K 1334bp in-house: 146-669 public: 1-145 PathoSeq: 670-1334 

ACAACGTATAATCGACAGnTACTATATCTGCTGACTTCAAAACCAATGCATTC 
TTCAAGCGTGCTCTGTCGATTTCTAT 

CATAACATCCACnrCCGGNGTAATCGGATTACTAAAGCCACAGAATCAAGGT 
GAACATCAAGCTTCAACTTCnrCTTG 

GTCCACGAATAATTITAATT^GGTTMTTSKKGS^MMKGCTTTCTACRGTAGGTT 
TGAATCTlTCCAACATTGTCnTGCA 

TAGAAACMGCACCAGACAAGAAACATGTCCACTCGACCATCAACYTSKGGGT 
AWWGACAAAGTWAATCTGTCTGGATCCT 

T TTCAT CCAGTTTCCCTGCATKGGAWACAAGTNTGTCCCGCACAGTTAAGACT 
G 11 rn ATTTTSKTGGTATTAGACTCA 

TCAAGTTCCGAAGGAGAGGCATCATTTARGGGWATAGACTCCGCTGAGTTAAT 
ACTGGATAAATCACTTAnrCAGATTC 

ACTGACTTGTWCTTCAGTGACCTTATCAAAATCCTCAATGTACTCSGARGCGTW 
TTCMCTCMATGTGAAGGCTnTAAAA 

GGGCAACRCTGGTR'CAAAATGCnTCTrGCRAGTTTGTACKTGACAGAAAAA 
TCAAAAACYTTGAAAGATATACCTCTT 



CTAAAGTCTTTTAAATCAATTTCTTNTCCTAATTTTTCATCATATAGCTTATGAC 
TTGGCAAACCCTCCTTACATACCAT 

ATCCATTACAATGCTAGAAATGTCAATCTTCACTGACGATATAAAGGATGGAA 
GAACTTCAAATAATTTTATAAACTCAG 

GATTGGCTGGTGTATCTGCTGCAGGAGCTCCAGATTTATTGTCCATrTGCTCAC 
TCCATGGACATACATTATTAACGTCC 

AACAGACGTACAATGTGAAAGATAA 

GATCATTAGCAGAGAGCAATTCGAGACTCTTGCTTGAAAGnTGATTGACACG 
TTTTGTTGTAACATATTGTAGGTGGCT 

AAAAGATTGACnTWRGTAAAATGRAACTTATTAACCCTGGGCCCTCACATTTC 
ACATmrCATCTTAAACAAAGKGGTT 

CAAAGKGGAACTTGGTITGGATCCYTTAWTGGAAWATTTCYCAGKRAATACTr 

TCAAAATCAACTCCAGGAGAGCCACAG 

TGATAATTGAATTGGATITAGATAAGCGGTTAAACTTCCCAATrrCAGTTTTAC 

CAAACTCTGGTAAATGAAGGTTAAGT 

TrTGTGTCCACCACAACAAGTrrACTAAAAACAGCCTTGAGCArnTGGAGGCA 

>36g2 (5') 520bp in-house: 1-520

CGTATAGAGAATAATCCGTTGAAATrGATTGTTCAATCATTATTGTATCl'l'llCC 
CnilllliGTTCTAACCATAATGT 

TAGAATAATTAGAAATTGTCTAAATATATATTCAGTTTAACAAAAAACAGAAT 
GCTrGCAATAAGATTTGATTTCTAATT 

ACTAATCGTTAATAmAGTTTGGTGGGGTTTTATTTATCGAAGATGTAGCATT 
ATTTGTATCNAATAGATAAAGAAACT 

TGAATTAAATGGCNTAATTTGTTGCAATAGTAAAAAAGAAGAAAAGTGGTAAG 

GAGTGAGTGAAAATATnTTTGCCCCA 

AmGAGTNGAAATCTrACACCNAAAAGTTTGGACNAAAAAGTnTTACTAAA 

ATCTGANAATCTNCCTGAATAGAACCG 

ATCATCCNCATNTCCGATT TCNTG AGGANAGATAGTGGCCCCACCTCNTGGTG 

ATTAGAAGGAGCNCCCATGTnTACAA
TATCTATATCCAGAATAACNTGTITGTGACCTCNCCCCNG

>36g2 (3') 472bp in-house: 1-472

CTCTATATATAGTGAAATATAACATCAAATAATGTACAAAAAAGTATAATAAA 
TTGATTTAGAAATGAGAAAAAGAAAAA 

AACTrGAAGTAGTGAAGATATATrrGTTGGCTATCTTTCTTGGTATGGCTCAAT 
TCAGCCAATCnTGGATGAAAGGTTGG 

AGTirrAGTrTCGTGGTTrATrGATTTGTAAGTACTTTCGGGCTAGAAAGTTNA 
CAAACATGATTAATCTTGATATANAT 

AmGTTAAACATTTGGTGCTCClsrrCTTAATCNCCCAAAAAGTTTGGGNCACTA 

TCTTTCCNCCNGAAATCTGTATATGT 

TGAOTGANCCGNTCCATrCCTGTTNAhrmCNGANmAGrTAAAACCTTTTTG 

TCCCAACCTnTGGGGTTAGANTTCN 

NCCCCANTGTrGCCNNAAATATmCNCNCO^CCCTNCCCCmCCCCNTnTAC 

NAATGCACCAAGTAAGCG 




>38g1 1348bp in-house: 183-940 PathoSeq: 1-182/941-1348

TCTCTGGTATAACTTGCACTACCTCATCGCTACCCCGGATITrTTmCKjTATGA 
TC TACACGTC CTCATCGCTACCCCA 

GATTTrmTCTGGTGCGCCGGACACGCCCTCCGGTCCGCACCGAAAACCGGGG 
TAATCTCCGTCGGAGATACACATCCG 

CGGACACAAAATCAGATGAGCTACCACCGAAAATTCCGAAATTTCAAAAACTC 
AAAATCCCTAAAAACAAACTATCCAGA 

NATTATTGCCATGCCCTGAGGATGAGTTTAGTITnTAATmTGAAAAATGTC 
CAAAACTGGTTGTGCTGTATAGGANG 

GGTAAGAATTTGCCATTCTGCCCCnTGGGTGGGTCAGTCNAAAAAAGANGTA 
TCACTCTGGTTCNAACGGGAAACAACN 

NAAAATGGGATTAAAMTWATCTCCAGAMCAAACTTAGCTTMWWACACCCAY 
TrTA GT TGTAC TSGYGWRCCMAAMMCMAA 

TTTrCCATmGmGGGGANGGGAATTTARACCAAA W114irri - riT GAAATrT 
CGCT MAGTGTY MAGAMCCSCAAAAG 

TCACCTTTTlTCGTTTrCMMCYACGGCARARGCYCACCGGTnTKYKTGGKGS 
MCRGCCMAATTGAWTTTGTGGGTGSGC 

ACGKGGAAAAACAGTTKGTTAGTGGACACGrnTTGCAGTGTGAAACTGCGCT 
CGGAGGTACTATATGCGAAAGCAGAAA 

AGACAATTGCAAGAATACAGAGAGTTCTTCTCTGGGCTANNGCAATGTGTTTA 
AGGCCAAGTCGACGAGTGGGGAGAGTC 

TGGAAGTGATATACACATCACGACCTAdTTATACGCTACGTTCGGCATGGGC 
GAGCCACTGTACGGTGGCAAGCCTGAA 

CAGTCCCACACCAGATATCTAACGATTCTGTGTATGGGCACTGATGGGATTTAG 
TGGATTACTAGCTGATAGCAAGTATT 

GAAAACTAAAACCCGACTCGGGGGTATGCCTTGGCAAGTAGCCGGAGTAAAAT 
CTGTGACTITGCTGAGTGTAACTCCCT 

CCATGGTTGGCGATGTTCGACGTGCGCGGCAGTTCTTGTCGTATCACAGTCGCA 
CGGACACCACACCGGGAGAATCTTAA 

GAGGGCTATATGGATGTGGAACGGTTTGCTTGCTGTGGTAAAACACTGGCGGG 
CGAGCCGACGTTCCACGGACACAGCAA 

TGTGTITGCAACCAAATAAATAACTrGTACGGTITGAACGTGTrnTGGCTGCT 
CCTTCCAGTTCITGGCGGGAGAAGCT 

TGGGCGCGGGAAGACCACTACTACGTAGTTATCTGGTTGATCCTGCCAGTAGT 
CATATGCTTGTCTCA

>3gG 842bp in-house: 171-842 public: 1-170 

ACCANTAAGGGANGAGGTGGACAAATTGAACCAGAATTGGGAGCCGCCAAAT 
TGGACGGAAACCCCGGGGAGGCGGCCAA 

TTTGGGCGAGAACCCACGGATGGACAAGCGGGGCAAGTGCCAAGGAAGTTGC 
GTGCCAGTGAAGTGGAATG1TGGAGAGC 

TGGCAATGAACTGCCCGGCAAGTGATACCAGGAGATAGGTGTGTATAGATTAT 
NATGGAACGCCNATTTTTGCAGTATCA 

CGCGTAATAAGGACAGCAGTTGGACATCGGTACATGAGAGAGCAATGTNAGTC 
TTGATANTAATGAGCCGTGTTGAAGTA 

GTATmAATCNAATTTTACTCCCAAAAGGACAATGGAGATCTGGAGATAACN 
CCACACTAATCGGTTCTAGACATAGAC 




TAANCCTGAAAGiXKjGTACTACAGCTTGTnTGAAAAGGTTTGCGTTGTATAG 
GCCAGTAAAATGNWCTTTTNTTNGGG 

TAGAATTTGAGAAAAAGTTGACTGAAAAAANCGCAAGANACGGGGTGNTCAT 
GAAAATAGACACACACAAAACCGTCAAA 

AAACAATGGAAAAGCTTCNNCATACGCAGTAGGAGGTGTCTGAATTGAGTTTG 
TATrGTTAnTAGAGTnTAAATTAGA 

GTTGTAAATTTrNGGGTAGAATTTACGAAAAAGTCGAACAAAAAAACGACAAG 
TCAGGGTGATTGCAAAAAAACAGAAAC 

AATAGATAATCTTAAATTAAGGTAGTAGAGGCTCTGTGAAGTAATTTAGAGTTT 
AAACAGGGGGGCACGAGTCAGTGTTA
GAGTTGTGAAGTlTATTTGGCTNNTGAATrGACTGGCANGAT

>480c 731bp in-house: 1-731

TTTGAANTCTTCNCCNCCTGNNNCTNCTCAAAGCGCTCCGCNrrGTGNCTNAAN 
GGGCTGGCTCAACGTACTATCAAGNG 

TAACATAAAACTANNACTTGGGNCATTCACGATAAAGAAACGGNGCCTGGAAT 
TCCAGCAATTGNCGCTGGACnTGAG TC 

GNAATCTAGGNATTGNTAGNCGAGAnTTNCATATGAANNACCTGGGAAACGG 
ATCAACTGGCTTAACANGACCAGTATT 

GATGNGGAGAGAAAAGTGGGACTTGCAGANnTCTCAATANCCTCATTCAAGA 
CTCAACACTTCAGAATGAACGAGAAGT 

GTTGTCG^mT^GCAATTGCCGNCTAAT^TAGATTCACCAAGGATATGT^ACA 
GAATAATCGAGCNNACTTGGATTNTG 

NGCAAAATAACTGGTACGATGTATATCGTAAGTTGAAACTGGATATACTCAAC 
GAATNGTCTAGCAGCATTAGTGNACAG 

ATACATATTCGTGATCGCATTAGNCGGGTCTACCAACCACGGATTCTCGACTAG 
GCAGGNCTATTGGTACAGATAAAGAA 

GANGCCTAAAGAAGAAGCAGOTGGGTTCCCAATTCTrTNAGAGNNTAGAAAAT 
TTGGTAGTACAGGAAGTTTCCCGATCA 

AAGAGGGTGTTGGACCCAACAGMTAANKmCGGCNGNGNCNTThn'CATrNA 

ACAATAAGNACTTCTTNAACACCAAAT 

CQJNGTTTTTA

>55g3 1063bp in-house: 533-1014 public: 88-532 PathoSeq: 1-87/1015-1063 

TTCNCCCCCATTCCAAGATTCCCCTTGTAAGTAAATTGGTTGGAGAACCNCGTr 

GGTTTAATTCCCCCGNNCGGGAAANN 

ATTNGTNGTAAGACAATTCTITCAACAATITATGATGTrGCTGCATTTTCGTAA 
ATCTGCTAGTAAGAGCACCCTAATCT 

TCGAAATCAATAGAAAACTCAATCmrCGATmCTTGTGGATCACnTACTC 
ATAGGCTCGCTATAAACAAACAAGCT 

CACTATGATAACATATAAGGTAAAGTATTGACnTGATGTAGATGTAATCACCA 
CAGAAGGCATATCAATACCAAGTCGAA 

GAGGAGCCTGTTTCGAATTATCATCATTGTAATrCGCTTGCTCTTCTTGATCCA 

GATCTTGTGCATTTGGAGAAAGCTTG 

CTrGACAAAATrrCACTTTCATAACACATTGTCATAACAGAAAGCTTTTCAATC 
AATAAGTTGTTGGCTCCAGCCCATTT



ACCATTrTGGGTTATTTCTGTTCCTAGCCATGGTGGCCAATTAGATITAGCTCC 
ATATGGATTACTAATACTTAACATAT 

CTGGACACCCTACAATATCCTCrrTGTITAATACAAAAACATTGGCATCTnTA 
GTAAAATACCATACCTAGTTTCTAAC 

AAGATTrGGTTTGCATTATTCCTGGAATCCCTAATGGACAAAATrrTACCTTTA 
ATAGATGGTGTAGAGATGAGCACNAC 

AGAATCAGGATAATCCTCNCnTGTTAITGAATTTGAGGTGGGANCACNTCNAT 
CAAATAATCCTCAACCACTmrCGT 

CNGGTCTGGTCTCNCNTAATATTGTGTTGAAATTGTCCAGTTTCCGTrGTGAAG 
TGANATCNTTGCCGGTAGANGTCTGT 

GATTTAACCCCCTCCCGTGTGNAACGACCGAATANTrCTGACGTCCANNAAAA 
ANCGTCCCTA TCGTTGT TTGCNTTTAC 

NCCATCCNCCNTTITTrCCAGTGTTrrCCATGGACTTGTGTGNCGAGTrATTTCG 
AAGCTGTGAnTCAATTTCACAAAA

TGTATGTATTTCAATGTCAAATT

>58gA 724bp in-house: 281-582 public: 209-280/583-724 PathoSeq: 1-208 

GTGGTGTnTGGAAAGTAAGGTGTGAnTGCTTAACAAAGGAAAAGAACGAGA 
CGAGAACCAATTCTCAATATATAGGTC 

nTCCAGGTAGGAAAACGACCAACTGTGGAAGAATGGCACTACCATTGGTTAC 
CAATACAACTACTGGTTGTCGTTTCAT 

CTGATAACATACACAACTACCATGTAGAAGTCATCAATTGTAAACTCAGCTAT 
ACTT TATC AATATCAGCGAAATTTTAT 

TTAGTTTTCGTTNGAAAATACAGTGAAAAATAAAAATCTACCCGTCNNCNTGA 
ATTGTNTNCCTCTGCANCNACAAATNG 

TNhnTATATTGTGATrCATTTCNnsfAGGCTTGATmCCANCTATTThrrAAAC 
ACCTTTCATITNCTACNTrCNGGGAA 

AATAACNCTTGTrGCTGTTGAAAGACCAATNCCNTTGTGAGTACAGAGGAATA 
CTNCCANTATNCGGCTTATANTTANCT 

AAAATTACAATACATAACAGGGAACCAGACNTGnTNCGTCNTTGATAATGAA 
CAGTTNTGGTNCTNNTGAAAAGTAATC 

CO^AATITGAATGGNTCGAAAGCAACACAATAAGAGTCTTrGCTTGATATTTG 
CTTCTCCAGAATATAAAATAACGCGTT 

AAAAAGACGTGTnTCTCTTCAATCGCCATCACAAATAAATTCAACAGAGTAG 

TAGATCCTGi 1 11 1 11 ICllCGCCACA 

TCAC

>60gK 990bp in-house: 445-752 public: 1-140/753-990 PathoSeq: 141-444 

ATTACCGATCCGTCGGATnTAAAACCACAAAATTGCCTGCAITAGCAGAGCT 
AGATATITTCATAGGGTGCTATATATG 

CAAAGATCTA1TGAATGCACCCGTGAGGACACAATGTGATCACACGTACTGTT 
CACAATGTATACGAGAATmTACTTC 

GAGATAATAGATGTCCGCTTTGTAAAACAGAGGTTTTTGAAAGTGGTCTAAAA 
CGTGATCCATTGTTAGAAGAGATCGTC 

ATTAGTTATGCCTCCCTTAGGCCTCATTGATTACGATTATTGGAGATTGAAAAG 
GTGGAATCGAAGCAAGAGGTAGATCG



TGAGAAATCAGCCAATGAGTCAGCGCTGAATGGTAATAGAAATGTAAACAAC 
GATG1TGACGAAACTGTGCGCGTTAAAG 

ATCAACTGAATGCAGATAAACTAGGTGAAGAAAAAGGGCAAGCTCAACATGG 
GGAACAAGTNAAACGAGCAGACTACTGA 

AGTTATTCTGTTGCTATCTGATGATGAAGAGAATGGTTCTGATAGCCTAGTAAA 
ATGTCCTAnTGTTTTGAGAGAATGO 

AATTAGATGTACTACAGGGAAAGCNTATTGACGACTGTCTAAGTGGAAAGAGC 
ACGAAGAGGACGCCTACAGACATTITA 

TCCCCAAAAGCCCAACGACCGAAGCAAATCACCTCCnnTCCAACCAACAAT 
AGATACCANAACNCCTTCCCCCACCTA 

CCAGTTNNGGCGTCNACAACTCCCACAGCAACTCCGACAACTACATTGTTGAA 
AGCAAACGTCTCATCTCCATCCCAAGT 

GGCGCAAAGTACAGTAAACAAGGGCAAGCCATTACCTAAACTCGATATCAGCA 
GCTTGAGTACTAAAAAAATAAAAGCCA 

AGTTGAGTGATATGAAACTACCAACAACAGGTAGTAGGAATGAAATGGAAGC 
CAGATACTAGCATTACTATGTGATTTAT



>61gB 602bp in-house: 1-602 

ACCTACNOTCACNCNNGNCNCNGCAACACCANCNTNCCNCCNAAANAAAGTC 
TTCTTTGAATNAGACNnTCATCTATTG 

GAAGACTTGATCTACCTGAAGGTCTGTTTAAATTAGGATTGAAATGATGGTGTr 
GTTGTTGTTGTGGGTGGGGATTGTGG 

TGATCATGGTTTTGATOCAAGGGTGATCCAGGACCGCGCTGGlTlCrilCNAAT 
TCATCTAAAAGAGCCATATGTCTGGG 

TNGCCGTNGTNGTTGTTTCTGAGCAATGAATTCTTGTTGGATTTGCXrGNTGTC 

ACGTTCATNTTCTTGAITGAAnTAA 

TACCATTITGAAAATCATTAAAATCTGCCACTNCNACAGTTACTCTTCCTTTTGT 
CTrANGATACAAATNNTACCCTTTT 

NCACGTTTGGCGTTTGTGNCCAATGCAAACNCNGTNCCCCGGGGNGNGGGGNC 
CCCCCCNGCCNTTGTCCANANCCTGNT 

GGNTAGTTATNTTCGCGCTTACNATCCCCCCCACCCNNGCGNGNNCCGCCCCA 
CCCCCAACGTNCNCCTCCTCTCGCCNC 

NNCTTIGCNGCGCCTNTCNGGGCACCCCTrCNCCNCNCCTCNC



>62gB 539bp in-house: 101-539 public: 1-100 

ATAATAGAATCTGATTTGAAATAATAGGAACCAACAAGAACAAAAAAAACAA 
GAAAAAAAAAAGATTTGTATAAACAAAA 

ATAGAATCAGAAACAAAAAGCTKTAGTTTGKGAAGAAATTGAAACAATCGGA 
AAACAACAATATCAAACTGMTGCCCAAT 

AACACTGGTATGTACCTAGATGGATTACCAAGATCTACTACATAAAATAATAR 
RGGAGITCCACTCACTCAAAGAGITCA 

AACCATGGGATAGCAGTGTnTGTATGAGACGTTACTACGATCAGTATTAACTA 
CnTGATCGAACmrGGGCMTAGAC 

AATCCACCCAGTTWTCTWCMCCTCACCACCMACMATGATAGTTATAGGTGAA 
TTTGAAAATWAAATACTATGGRAATGCA 



AATGCCAACCTTGACACCAATCATCCTGTA 





TTAAGCAAGTCAATCAAYSSWCMKRGCATRKTGCAATWTGCYWGWAYCAWA 
GSATGYAATCSATATTACMRRCYKTKSYY 

GRGRYYRKKRATACGCGAWMATWTWKAATCMMMGAGTCYWATYCTGCTGK 



>64gB 627bp in-house: 1-627 

TNCANCCTNCCATNO^CCCAGGCNNNGCCACCCCNGCCGNNCCCCCNTNTITC 
CCCCCCTCCTTNGTNGCCCTCNNGGTG 

GTGTTTGTGGTGTGACNAATAAANATGGTNTATCATTAGAANAGGACATTGCN 
NCGGAAATGACTGTCGACAATAAAGAA 

GCAAATATATACAATGGATTATGAANGTGCTAGGATGGAnTGAAAGTTTATC 
TGGGTTTATTCCAATGTAAAAATTATT 

TGTAATTGATATGGCTAATTATnTGCTCNATATNTATCACAAAAAAATGATTA 
AGTTCGAAATGAAATTGGCNTCCATA 

TATAAAATTTCTGACAGGAAGAGAAAATTCANGACNTGITGCCCNAAAAAAAA 
AACnTACCCCNCNTCNANTCNTGTNN 

GACTTAAC CCCCAA AAANAANANNGCTGGCGGCGGNAAAAAAATAGGAGGGG 
GCCGGNNGTITTrTAAAATrTNANNCTT 

GAATATGAACCCAAN>nTrGNhrrrC>riTITTOCCACNCCCCCTrCAAATITO 
TCCATGTTCCCAAGANNAGGGNGGNG 

GGGGNGGTTCCNNCmTAAACCNCCCCCCCCGGGTGGNGGGGNCCGTNTTNT 
TTCCGGNGGGGCNT

>65g 441bp in-house: 1-441

TTNCTTATGTAGATGTTGTrCATGAATTTGTATGAACGGACTATGGCTAGGATT 
TGGCCAATCTCGGTATTACTACNTTT 

TCAAGTTCAAAGATTGGGAAACTCGTGTATnTCGTACTGTCTACATnTCTTA 
AAnTGATAAACGCATAGTAAGTCTT 

TGCTTGATATACTATGAGATGATTAGAATTAAAAAGTAGACGACTAGTTTCACT 
AGATTTA1TGAAGTGTCAAAATATAT 

TCAGAT TGGT TGCAA CTGATGGTCTCGAAAATGCNACAGGATnTnTCCCCCA 
TTTTTTGCCAATTTTTGTCCNATAGA 

GTAGAAAGTACCNGTATNCNAATTGTCCCAAAAAGCGATTATAATCCGTACCA 
ATATTTCCAATnTCNnTAAACCCTG
TrCNCCTCNGTGTTGGTTTGnTGAAAACNTAACCANGGTG 

>8c_cp 890bp in-house: 287-890 public: 1-124/154-286 PathoSeq: 125-153 

ATGCAATTCTCATCCGGTGTCGTCTTATCCGCTGTTGCTGGGTCCGCTTTGGCTG 
CTTACTCCAACTCCACTGTTACTGG 

CATTCAAACCACTGTGTCACCATCACTTCATGTGAAGAAAACAAATGTCACGG 
AAACTGGAAGGTTACCACTGGTGTTAC 

CACCGTCACTGAAGTTGACACTACGTACACCACCTACTGCCCATTGTCAACCAC 
TGAAGCTCCAGCTCCATCTACTGCTA 

CTGATGnrCTACCACCGTTGTCACCATCACCTCATGTGAAGAAGACAAATGTC 
ATGAAACCGCTGTCACCACCGGTGTC 



WWTCMAAGA 







ACCACTGTCACTGAAGGTACTACCATCTACACTACCTACTGCCCATTGCCATCT 
ACTGAAGCTCCAGGTCCAGCTCCATC 

TACTGCTGAAGAATCTAAACCAGCTGAATCTTCCCCAGTTCCAACCACCGCTGC 
TGAATCTTCCCCAGCTAAAACTACTG 

CTGCTGAATCTTCCCCAGCTCAAGAAACCACTCCAAAGACCGTTGCTGCTGAAT 
CTTCTTCAGCTGAAACTACTGCTCCA 

GCTGTCTCTACCGCTGAAGCCGGTGCTGCTGCTAACGCTGTCCCAGTTGCTGCT 

GGTTTGTTGGCTTTGGCTGCTTTGTT 

TTAAGmATrAGAGCTTAAATCAAATAmACAAACAAAATTTTCATnTCCC 
CCCTTTCCCTTTCTTCATTCTTCAAA 

AAAGGGTTATTTACTATTAATTGATAAATTrATGGTTTCATGTTAATrrACCCTT 

TTCnTATAAACATTGGTATTATTA 

TrATCATCATrAGNTTTATTTATATmCGTGAGTnTrCGGmTrAATrAATnT 



TTGGTACTAG

>80g3 669bp in-house: 1-652 PathoSeq: 653-669 

TTGCCAAAATnTATAAAAAATrGTCAAATTGAAAAGAAGTATITCCCAAAAT 
AAATTGTmTTCATCACAACCGGTTC 

ATATCGCCATAGNCCATTTTTAATCITAAGGTTGATACCAGTrAATTGTTGATTT 
CTCTGTTATAGCCCCTGTCTAAATC 

TGTCTATTTCTGGTATCGAATCAAAATGTCGCTCATAATGTGCATGTCGCAAAG 

ATGTCGTAAAGTTTTGATTTCATACT 

CATCTTAAAl 11 111 1 1 AGTGATTGGCATTTTGTTCTTTCACATAGrilll ATTTC 
TAGTTATCAACCTATCAAATACAC 

CTCCACAACAATGCATCCAAATAATAAAAATTCATTTAAATCAAAAAAGAAAT 
TTATAGATCGTCGAGAAGCCAAGTCTC 

AAGATATAAAACGTGCATTAACCCATAGGGCTAGATTAAGAAAGAACTATTTC 
AAACTATTAGAAAAAGAAGGGTTACAA 

GAGGAGAGGAAGCCTGAAGATGAGAACGATATAAGACCAACCAAGAAGAAG 
GGAATAAATnTGAAGAACGTGCAGCCAT 

TGTGAAACAACGTAAAGAGGAAAAACGTAAATTCAAACTAGCAAGTGTACAA 
GCAAAATTGGAAAAGATTGAATCTAATT

CGAAAGAAAGAGCnTAAACCGTGACCAC



>8S33 481bp in-house: 1-431 



TTTGGATACATATTAAAAAnTAT 





>21g2 667bp in-house: 1-667 



TrrAGTirrATATTGATGATGTITITAAGTGCTrGTTTATCATGGTGGATGGAAA 
TTAGAATGAGTAAATrGAATGGAAA

ATCACTGCAACACCAACAACAACCACTGGTGGATACGAAAATTTAGTGrACAA 
ATTTCTGCCAAAAAAATACAATAAAAA

CCGCTTATAGTCTTCTACTGACATAACAACACAAGTCAATAAATCAACAACTC 
ATAAACAATGTAGAC1TAATACTATCG 

CTTAArrATTTAAACTATAATAAATACCCTATAGTATTATGCCnTGTCAATGTG 
TGTAGAATTTGGTTATTACATATCC 

ATGTGTNATATATATGTTGATCAAAAAAACGCGATCTTCTCITTGGTGTAGTGT 
GrrACNCAAAAAATTCACTA>n-CTAG 

GTCNCTGANAATCACTTGAAAATCAAAAATrrGTTGAAATrGAATITCCTCMA 
YTITGAAATTTTGmGAAATTTmT 

mGCTTTACAAAAAGACTCCATmOTmCCAmCACAACCAAmCTTAAT 
TCCTCnnTCATAATTAATAACTA 

TCATTACTTACAACTACAAACAACTACGATCATTTCCTAAGAAAAAGCAACGA 
GGGCG AATTGAGACATTAATCCCCnr

ATnTATCATCATGCCTTATACAGAAC
>66g4 579bp in-house: 1-579 

CCCCGTTAACCACTTCTAGGGTATACCAnTCATCTGACTGAATAACTGGTTAG 
TCGAnT GTTGTT GAAGAAAAGTGAC 

CACCTAGiiiiriCTGCCAACATTTITrGCGATGAGCCGTCGACGCGTrGTCnT 
TrCTACCCCACGTTTAACAATCTTG 

CCAGTCAATTCCCTAGCCAAATAAACTTTAGACTCACAACTCTAACACTGACTC 
GTGCCCCCCTGTTTAAACTCTAAATT 

ACTTCACAGAGCCTTTACTACCTTAAATTTARGRTTWTSKAKKGTTTCTGTT^ 
TTGCAAATCAC CCTGAC TYGTrnT 

TirrCAGCCAGGTrmCGTrAAAATCTGACCAAAAAATITACRACTCCTATWT 
TTAAAACTCYAAAWWACAATTAAAAC 

TCAATrCAGACAAGTCCTrCTGCTCATrCTGAGTCTTCTCTATrGTCITTTGACT 
TTTTGTGTGTGACTATnrCATGAT 

CACCCCGmOTGCATITmTCAGTCAACTrnTCTCAAAATCAAGCCAAAAA 

AACACACCTTTAACTACCTATACAA 

CGCAAACCTATTCAAAACA 



Sequences with known function, C. albicans sequence NOT present in the public domain 
(ALCES/EMBL) 

>CFL (223c_cp) 165bp in-house: 1-165

AACTATTGCCAATGGTAAATATGCCAGTGAAATCGAGAATTTTAATAAGTCGG 
TCCCTCTTAAGGTCCCATTCAAATTCA 

CTAATGCACAATTGGATCnTATGCTGCTAGCACACATAACCAAGAGCCAATA 
TCCTAGTAACGACGCACCATAGTAGAC T . 



>EF4 (29g3) part1 479bp in-house: 130-479 PathoSeq: 1-129

CGCGAAGNNTCAATCATNTCAGAAGAAATGAAAGAAGGTACTCCGTTCnTAC 
TATTGTGGCAAGAATCCCTGTGA1TGA 

GGCATTTGGGnTTCCGAGGATATrAGAAAGAAGACATCCGGGGCAGCTAGTC 
CTCAATTAGnTTTGATGGGTATGATA 

TGTTAGATATCGATCCATnTGGGTTCCACATACTGAAGAAGAATTAGAAGAAT 
TGGGTGAATTTGCAGAAAGAGAAAAT 

GTTGCTAGAAGATATATGAATAATATCAGAAGAAGAAAAGGGTTATTTGTrGA 
TGAGAAAGTCGTCNAAAATGCTGAAAA 

GCNAAGAACTTTGAAAAGAGATTAGATTATCCNGTTNAACAGGCCATATGTGT 
GAAATTGTTTCCNAAAAGACAGATACN 

ANGTGGNCCOTATITGTrTAATATTCCACNACCAGTTAATGTTTTGATATNGAT 
OmTATATAGTCCAATOTTGAGAC

>EF4 (29g3) part2 1706bp

AAGTCATGCGATTGCAACAAGGATCACAAGAACCAGAAGTrCACGAACATTTG 
A1TAATTTGATTGATTCACCTGGGCAT 

ATTGACrmCGTCTGAAGTGAGTACTTCTTCGAGATTATGTGATGGTGCAGTT 
GmTGGTCGATGTCGTCGAAGGTGT 

CTGCTCACAAACAGTCAACGTTCTACGCCAATGTTGGATTGATAAGTTGAAGC 
CATrACTAGTTATTAACAAAATTGATA 

GGTTAATCACAGAATGGAAATTGTCTCCCTTGGAGGCATACCAACACAnTCC 
AGAATTATAGAACAAGTAAACTCTGTG 

ATTGGGTCAl'll'niGCTGGTGATAGACTAGAAGATGACnTGAATTGGCGTGAG 
GCTGGTTCTGTCGGGGAGTTTATCGA 

GAAGAGTGATGAAGACTTGTATTTCACACCTGAAAAGAATAATGTAATAnTG 
CCTCGGCAATAGATGGATGGGCAnTT 

CAGTCAATACATTTGCCAAAATATACCTGAAAAAATTAGGGTTCTCTCAACAA 
GCATTGTCAAAAACTCTCTGGGGAGAC 

TnTACTTGGATATGAAAAATAAAAAAATCATCCCTGGTAAAAAATTGAAAAA 
TAATAGTAAC AGTTT GAAGCCATTATT 

TGTTTCGTTGATTTTGGACCAGGTTTGGGCTGTTTATGAAAACTGTGTTATTGA 
AAGAAATCAAGACAAGTTGGAAAAAA 

TCATrGAGAAATTAGGGGCCAAAATCACCCCTCGTGATTTGCGATCCAAAGAT 
TACAAGAACTTGCTAAACTTGA1TATG



CGAAT 





TCTCAGTGGATTCCTTTGAGTCATGCCATATTGGGGTCAGTGATTGAATACTTG 
CCAAGCCCCA1TG1TGCTCAGCGTGA 

AAGAATAGACAARAWWTWRRWCGRRMCSMYYTATARWRYWKTGKWrrCAR 
AAMTGSATAWWTCCAAACTAGTCGAMMCTT 

CATTTGKMAARRMKWTGCASRMAYSYSMKRGTWMACMCCCRKAAMCCMAT 
WGGCMATWKYAKNTGTMYCAAAATTGTTrG 

TCAATCCCCCAATGAAGACTTACCCAAAGCTAGTAATGCCCGCTACTGGAGGA 
TTGACGGCCGATGAAATCCAAGAACGA 

GGAAGAATTGCTCGAGAAlTAGCCAAAAAGGCATCTGAAGCAGCTGCTrrGGC 
ACAAAGAAGGTTCCCAAAAATGAAGAT 

GAGTrTGCCATrAAACCCARGAAAGATCCATTTGAATGGGAATTTGAGGAGGA 
CGATTTTGAGAATGAGGAAGATGAGAG 

CGATGCAAACGCAGTTGAAGAATCAACTGAAACCATAGTGGGTTTCACTCGTA 
TTTATTCTGGATCGTTATCTAGAGGCC 

AAAAGCTCACGGTAATTGGACCCAAATACGACCCTTCATrACNTAGAGACCAT 
CAAACCAACTTTGAACAAATAACCAGT 

GAAGTGGAAATrAAAGAnTGTTTTATATCATGGGAGGAGAATTAGTGAGAAT 
GGAAAATTTCCGTGCGGGTAATATTGT 

TGGGGTTGTTGGATTGGATAACGCCGTGTrTAAGAATGCCACAATrTGCTCACC 
GTAACGTGAAGATAAACCATACATTA 

AnTAGnrCAACATCAACCTTGATCCACAATAAACCAATTATGAAAATAGCA 
GTTGAACC AAC AAACCC AATAAAACTA

GCAAAATTGGAACGAGGATTAGAnT

>NDI (17c_cp) 807bp in-house: 1-614 PathoSeq: 615-807 

AACCTATTCCATAATGnTACTAGATCATTGATTAAAGGTGGTGGCAGACTTGC 
TACTACCAGAT CATTG GTCAACAACT 

CTACTAGmGGTTTTAAAAAATCAATTTAAGAAATATTCAACATCAACTCCTC 
CTAAGGTTGCCAAATCAAAATCTTCG 

ACAATTGGTAAAATATTCAGATACACrrnTACACTGCTGTGATATCGGTrATT 
GGTTCTGCCGGTITGATCGGTTACAA 

AAnTACGAAGAGTCTCAACCTGTTGATCAAGTGAAACAAACACCATTGnTCC 
TAATGGTGAAAAAAAGAAAACTTTAG 

TrATrTTGGGTTCTGGTTGGGGTGCTATTTCATTATTGAAAAACTrGGATACCA 
CCTTGTATAATGTTGNTATTGTCTCC 

CCAAGAAACTATrTCCTTITCACCCCATTGTTACCATCTGTrCCTACCGGTACTG 
TTGAATTGAGATCTATTATTGAACC 

TGTCAGATCAGTCACCAGAAGATGCCCTGGCAAGTTAnTACCTTGAAGCAGA 
AGCTACAAATATNAACCCCTAAAACTA 

ATGAGTTGACACTTAACAAAGTACTACTGTCCGTTCTGGTCATTCTGGTAAAAA 
TACTTCCTCTTCTAAATCAACTGTTG 

CCGAATACACTGGGGTTGAAGAAATCACTACCACCTrGAATTATGACTATTTA 
GTT GTTGG TGTTGGTGCTCAAACAATN 

CTAhTITrTCGGNAATCCTGGGAGNCGCOTGAGGAAimCAACCCCrrnTTGAA 

AGAANGNCCAGTGGANGCCNTCTGCN 

AATTAGA



>RPL27 (357cL) 560bp in-house: 1-560 



AAAAATGGCTAAGTTCATCAAATCTGGTAAAGTTGCTATTGTTGTAAGAGGTC 
GTTACGCTGGTAAAAAAGTAGTCATTG 

TGAAACCACATGATGAAGGTACCAAATCTCACCCATTCCCACATGCCATTGTC 
GCTGGTATTGAAAGAGCTCCATTGAAG 

GTTACCAAGAAGATGGATGCTAAAAAAGTTACCAAAAGAACTAAAGTCAAGC 
CATTTGTTAAATTAGTAAACTACAACCA 

nTAATGCCAACTAGATACTCATTGGATGTTGAATCATrCAAATCTGCTGTCAC 
TTCTGAAGCTTTAGAAGAACCATCTC 

AAAGAGAAGAAGCTAAAAAAAGTTGTCAAGAAGGCnTTGAAGAAAAACATC 
AAGCTGGTAAGAACAAATGGTTCTTCCA 

AAAATTACACTmAAGAAAGGAACCACCTTrATTTGAATGTTTGTAATATAGG 
TTGAATCAGAGAGACAAAGTAGAAGA 

AAATACAAAAAAGAGAGTATATCTGTATAGTATAATTTAATGGGGGTCTAATT 



>SADH (110c_af) 650bp in-house: 1-650

AACCTTTTGAAACGATTAANTNCAATCAAACAATCTTATTCAAAAGTACTCGC 
AATACGTACAATGTCAA1TCCATCTAC 

TCAGTACGGATTTmTATAATAAAGCTAGTGGTCTTAATTTGAAAAAAGACTT 
GCCGGTTAACAAGCCAGGTGCTGGTC 

AATTGCTTTTAAAGGTTGATGCAGTTGGCCTTTGTCATTCAGATTTACATGTTCT 
CTATGAAGGTTTGGATTGTGGTGAT 

AATTATGTGATGGGCCACGAAATTGCTGGGACTGANGCTGAACTANGGTGAAG 
AGGTGAGTGAGTTTGCAGATGGAGATC 

GTGTCGCTTGNGTCNGNCCCCAl^GGATGTGGNCnTGCAAACACTGTCTTACT 
GGTAACGATAAATGTNTGNACCAANT 

CGmATTGGATirGmTrCGGATTGGNTrACAATGGNANGNTNCGANCCATrT 
TTGGTAGNNANGAGANCNANAACTTG 

GTAAAGATCC^^WAATGTACOTNCCGAGNAAGCTGCNCTTTTN^^^GNNT 
TATrGANTCNTACCAANGNTTTTAAGG 

NATGNAGNAGTGTGNCANCNTGGGAATAATAATACACNTTCTTNCCTGATGGN 
mTOTGCNGlTACNCNNTrCAAGNNNN 
CNNNAAGCNC



Sequences with known function, partial C. albicans sequence present in the public domain
(ALCES) 

>ABP1 (409c10) 1435bp in-house: 842-1435 public: 1-382/779-841 PathoSeq: 383-778

ATGGAAAAAATTGACATGAATACGTATTCAAACAATATCCAACAAGCATACGA 
TAAAGTTGTTAGAGGAGAACCAAATGC 

AACATTCGTCGTTTATTCTGTTGACAAAGACGCCACTATGGACGTCACTGAAAC 
AGGGGACGGATCATTATACGATTTTG 

TTGAACATTTTACTGATGGACAAGTTCAATn'GGTTTACCCAGGGTTACTGTTC 
CAGGATCTGACGTCTCCAAGAACATC 



TACTTACCACTTTATTCGTGCATTATT 






TrGTTAGGATGGTGTCCTGACAGTGCTCCACCAAAATTGAGATrGTCATATGAC 
AATAATTCTGCTGATGTGTCAAGAAT 

ACTGAGCGGATACCATGTGCAAATTACTGCAATGGATCAAAATGATITATACG 
TGAATCACTTCTTGAATAGAGNTGGTG 

CTGCTGCTGGTGCAAGATATTCCACTCAAACTTCCGGACTCAAAAAACCATCCC 
CTGCTGCACCTAAACCTACrrCAAAA 

CCTGrrOTTGCTAAATCTAGTrCTGCTrCAAAACCTrCATTrGTACCCAAATCTA 
CTGGGAAGCCTGTTGCTCCAGCTAA 

GCCAAAACCAAAGAACATCACCAAGGATGCTGGTTGGGGTGATGCTGAAGAC 
GTTGAGGAAAGAGACnTGACAAGAAAC 

CnrGGATAACGirCCATCGGCATATAAACCAACAAAGGTrAACATTGACGAA 
TTGAGAAAACAAAAATCAGATACAACT 

AGCTCAACTCCTAAAACATTCAAATCTGAACCACAAGAAGAAAAGAATGACG 
ATGATGGGCCATCCAAACCTTTATCGGA 

AAGGATGAAAGCCTATGATCACGACTCAAGTCGTGATGGAAGATTGACTTCTT 
TACCAAAACCAACGATTGGACATTCTG 

TNGCCGATAAATATAAAGCTAGTGCATCTGGGAATGGTGCTGCTCCTGCGnTG 
GTGCTAAACCAGCATITGGTCACAAT 

CAGTTGATTCAAGAAAGGATAAATTGGTAGGTGGTTTGTCGAGAGATnTGGT 
GCTGAAAATGGAAAAACTCCGGCACAA 

ATTTGGGCTGAAAAAAGGGGAAAATACAAAACAGTGGCCTCCGATGAGAAAG 
AAACTAACTCAAGTGAAAAAGTTGATGA 

GCCAGAGGAACATCATGCTGCCGACTrGGCCAAAAAATTTGAAGAAAAGGCA 
AATATTGCTGGCGATACTCCTTCCTTGC 

CAACTAGAAACTTACCACCAGCACCACCAGCACGAGAAACCGCAATTCCATCT 
AACGAAAAAGACAAAAAAGAAAAGGAA 

GAGGAAGAACAAGCTTCAGCACCATCnTGCTACTAGAAACTTACCACCACCG 
TNACAAAGACAACCTGAGCCCGACCAG 

AACCAGANGAAGAGGAGGAAGAAGAAGAAGAGGAGGCTCCTGNTTCAAGCTT 
ACCAGCAAGAAATCTCCCCCCAC

>ADE12 (226c_af2) 993bp in-house: 1-646 public: 647-677 PathoSeq: 678-993 

NATAAGGAATGANCCAAAGNAGTGTANNAGNTAATAAGNTNAGANANTTCCA 
AAAGAANNAAAANAACCTAACNACANAN 

NNNTATTANATCCAGTAGAAANACAANATTAAGNNACCANTATCTrNNAAAG 
NTTNACGANACNTNGTrCNAATTGTTCA 

T^r^^GGGANGNACCCTCAACAACT^^^A^^^NGGGTAAACTTTAACNACACCATA 
GACATTTNTNATGGTTNTTGAGGAATA 

CCCAACCCAGTCANAAANACCAACAATACCAGTTGATAATAAGTGANGTATGA 
TAAGTACAGANATCAATATCCAACATT 

AAACGCATTAGCACCTTCANCCAAGATTmTTATTGGCAGCNATAGCTTCGTG 
CATNAAGTTGACGGAGTCTGACGACG 

AAIGGTCTCAAGGTirCACGGNATTTTTCAAATCTTGCCAATTCTTCCTTAGGA 
TCATATTCAAATTCACCGTATCmr 

TrGTCTACTCTCGACTAATCTCAAATATCTAGTITTGAATTCTTCCCAAGCrrCT 
GGATCAGGGTTGACTAAATGGTGGA 

CTCTGATACCTGATCTACrrGCCTrGGTTGAGTAAGTTGGACCAATACCTTTAC 
CGGTAGTACCTATTGAT7TCTTATTG




GNTGTTAATTCAGCATCTTTCAATTTATCAGCACGTTGATGGAAGTCAAAGACC 
AAATGAGCTCTAGATGAAACAAACAA 

TCTATCACGACAATCTAACCCTTTTGCTTCCAAGTnTCCAATrCAGCAAAGAA 
GGAAGGAACGTGGATAACAACACCAG 

ATCCAACTAAGTnTGACAnTAGGATTGACCAAACCAGAAGGTAACATGTGG 
AAGTCATACTTGACTTTACCAACAACA 

ATCGTGTGGCCAGCATTGTTACCACCTTGACATCTGGCACAAACATCGATATCA 
TCACATAATAAATCGACTAATTTACC

nTACCTTCATCCCCCCATTGAGATCCTAATAC

>CDC48 2448bp in-house: 95-220/285-1340 public: 1-94/221-284/1341-1373/1783-2273 
PathoSeq: 1374-1782/2274-2448 

CGnrCCGAWAGCTGCACCTGAGTTGGCACCTGCTGCTGAACCATTATCAGTG 
GCACCAGCATnTCATTGAATCTAAAG 

CTAGAAAATTGACCTCTTGAGGCTTGCAATTGITGAGCGTAAGACTCATAACG 
ACGTAATTCAGCGTCTGAAACAGATCT 

rnTGCGGTCTTCATAGCCTCITCAAAGTGAGCTCTGGTAATGTAAGGCACAGG 
GTCTTCTTCrTCAACTTCATCTACCT 

TCATATCAACATCTTCAGrrATCACCTGTTCCTTITGCTrCTTTAATCTrGTrAAT 
CTTrACTTGGGCTTCAATAGAGTC 

TTTAATAGCAAATTTAGCAGATCnTGAACAATATAAGACAAATCTGCACCCG 
AGAAACCGTGAGTGATCnTGGCAATTT 

CGTTCAAGTCCAAACCAGGTTCTAATGGAGTGTTTCTCAATTGAGCTTGTAAAA 
TAGACAATCTAGCTGGCTCATCTGGC 

AATGGGACATAAATTAATTGATCCAATCTACCTGGTCTCAATAATGCAGGATC 
AATTTGATCTGGTCTGTTAGTGGCACC 

AATGACAAACACATTCTTCTTAGCATTCATACCGTCCATTTCAGTCAACAATTG 
ATTGACCACTCTGTCGGAGGCACCAC 

CAGCATCACCGTGAGAACCACCTCTAGCnTGGCAATGGAGTCCAATTCATCC 
AAAAACACCACAGTAGGAGCAGCAGCT 

CTGGCCTTGTCAAATATATCACGGATATTAGACTCAGATTCACCATACCACATA 
CTCAACAATTCTGGACCnTGACAGA 

AATGAAATTAGCAGAAACTTCAGTAGCAACAGCCnTGGCCAAAAGTGTCTTAC 
CAGTACCTGGTGGACCAAAGAACAAAA 

CACCTmGTTGGTGCCAATCCGAATrnTGGTATTGATCTGGATGTAAAACAG 

GATACTCCACGGTTTCTTTTAATTCA 

TrCTTAATGTTGTCCAAACCACCAATATCATCCCAAGTGACATTAACATnTCA 

ACAACAGTTTCACGCAAGGCAGATGG 

GTTGGAGTTTCCGAGAGCAAATCTGAAGTTGTCTTGAGTGACACCCAAAGAGT 
TCAACACTTCAGTATC/VATGGTTTCTT 

CTrCCAAGTCGATAAGATCCATCnTrrCACGGATrTGTTGCATAGCAGCTTCTG 
AACATAATGAAGCAATATCAGCACCA 

ACGAAACCATGTGTITCAGAAGCGATGGCTTCCAAGTCAACATCATCAGCCAA 
TTTCATATrcnTGTGTGGATTCTCAA 

AATCTCTAAACGTCCTTCAGCATCCGGAACACCAATGTCAACnTCTCTGTCGAA 
TCTTCCATCAATAGAATTTTGAAATC 

TTCTCAAAGCAGGGTCCTOTTAGTAGCAGCCAATTAACAACCTACAITAGATCT 
GGCCCTTCATACCATCCATAAGGGGT



TTGGATTCAGACTCACCAGCCAmT

AGACATAATrrCTGGACCATTTATrAAGAAAAAGAAGGCACCTGTITCATrGG 
CCACTGCTCTTGCCATAATGGTnTAC

§Xg^tmSS^^^^ 
^c'^^aS 

taaatcaccttttctcactggtctat 

^™VF^^^^^'^^^°^^^°'^aaaggtcgaataaggaaccat^^ 

CCTrCAACAGTATCAGCAATTGGCAAT

ACTGAGATrCTGTTGGCATATITAATATCAGGACATGGATGGACAGTAACGAT 
ATCTCCCAATCTGACACGCAAATTGIT

ACGAACACATCTGTTAACTCTAGCAACGCCATCAGGCATATCATCATCAGCTA 
AAACG ATC AAC ACTGTOTCCnrCTCT

ATCTACAGCAGAAGCACCAGAAGCATCAAAATGTTGmnTATCrrC

>CIT (99g3) part1 1435bp in-house: 803-1435 public: 1-333 PathoSeq: 334-802

TCCAAAACTTATTGCTTAGCTATACGGTGTAATGGACCAGCTTAACCATrCAAA 
CCAGCAGCTAATGACAAGAATGGAGA

ATCTCATTAATrCAACANAlTCCTTGGTGTCACCAAAACCTAACANACTGGCCA 
AGTTAGCACCGTAATCCAAnTGGAG

TCAATGGCAGCTGGCAATTrACCATCGTGGAAAACGTTTCTGTAAATCTTAGCA 
GCAATGGTTGGCAATrrAGCTAACAA

ATCGATGGAATCTrCGTAAGTGTAmCCAGTATTCGGATTTGTTGGCACC'nT 
AGCATAAGCTTGGGCAAATTGGGATr 

CAGATrCCAAAGCAGTAACGGCAATGGAGAATTGAGCCATTGGGTGCAAGTGA 
GATGGAGATCTGTCGATCAATTCTrCA

ACGTGCmGGTAATGCTGATCTAGCAGCAAATTCTTCGGATAAAGCCrrAGTT 
TGGGCGTCAGTTGGAACTTCACCAGT 

CAACAACAACCAGAAAAGAGCTTCTGGTAATGGTTCTTCACCACCrGGTGCTIT 
TGGCAATTCTnTTGAATGTCTGGGA 

TGGTTCTrCCTCTGAAACGGATACCTrCAATTGGGTCCAAAACAGAACCTTCCC 
AAACTAAACCTTTGATACCTCTCATA 

CCACCGTAAGCTTGTTCTAATAAAACTTCACCCAATGACAGTnTACCGTGTTC 
imTTGAATrGITTAAaTCTTCAG 



CTrrGGCTGGCAAGATTTCTTCCAATCmGrnTAAGGTCTAAGAAAGCAAGT 
TAGTAAATTATCTTTGTATATACGTT 

AAAAGTAAAATCCCAAGCACATTGCCCGGATCCTCAAAAAAGTGAATACATAC 
TGGTTCAGCAGAAGCATATGTTCTGAT 

GCTG ll ' ni GAAAGTGCTCTTGGCTACGTTGGTTGAACGTTGAAITGATCTGAA 
TGCAGACATTGTTTGATAAATATACT 

ATATTCTAAAAGAAACTTAAAAGATGGAAAAAGGAAAAGAAGAAATGGAAAA 
AAAAAAGTAAAAGGAAATCAATTGCAAA 

TATATACAAAAAAATCCAGCANGAAATCAGTrGAAATTTATATTCCAAATTnT 
GGTnTAATGGCTCTTGAAGTTGTGG 

TGAACA Al 11 11 1 1 1 1 111 l AC 14 ' ri l4 CCTCATGGATTTACTTTAGTnTGGGTC 

TGTCGNGCTGCCGTACAACnrCC 

TGNGAAAATTGAl'llUrrrriCTTCTGGNGAGGATTITTTGGCGTTCTITGTT^ 

ACTTCTTATTTGTCCCACTANATGG 

AGAGCAAAAAAAAAAGTrrGACTTTTACTnTAACCAATCAATnTCCAGAATN 
TGAACGAGAAAAAGGGAAAAN

>CIT (99g3) part2 327bp public: 1-327 

CAAAGAGAATITGCTCTTAAACATATGCCAGACTACGAATTGTrCAAATTGGTT 
TCAAACATTTACGAAGTCGCTCCAGG 

TGnTTGACCAAACACGGTAAGACCAAGAACCCATGGCCAAATGTGGACTCCC 
ACTCTGGTGTCTTGTTACAATACTACG 

GTTTGACTGAACAATCmCTACACTGTCTTGTTCGGTGTTrCCAGAGCCnTGG 
TGTCTTGCCACAATrGATCTTGGAC 

CGTGGTATCGGTATGCCAATTGAAAGACCAAAATCnTCTCCACTGAAAAATA 



CATTGAATTGGTCAAAAACATCAACAA
AGCTTAA






>HOL1 (409c5) part1 695bp in-house: 98-695 public: 1-97



TTTCTGGATTATCATGTrAnTGGTTAGCTANACGGAATAATGGGATAATGGAA 
GCTGAATATCGATTATATNTATTAGT 

TATCACTTrAATCATTTAACCCGTAGGGTTAATTATG'nTGGTGTTGGTGCCGCT 
AGAGAATGGCCATGGCAAGTGATTT 

ATGTTGGATTAGGTirCATTGGGTTrGGTrGGGGATCAATTGGTGATACTrCAA 

TGTCTTATTTAATGGATGCTTATCCT 

GATATTGTCATTCAAGGAATGGTGGGAGTAAGTATTATTAATAATACnTGGCT 
TGTATTrrCACTTTTGCTTGTTCTTA 

TrGGrrAGAIGGATCAGGAACACAAAACACATATATTGCCTrGTCAATTATTGA 

nTTGCTACCATAGCATTGGTnTCC 

CCTirTTATATTATGGTAAAACATTTAGAAGGAAAACTAAAAGACTrTATGTTr 

CAATGGTTGAATTGACTCAAGGGATG 

GGATAAGAGAGIGAGIGGTAAAAGAATTTTATTAATGATACATTrATTATTAG 
AATTACTACTATGGAAATCCGAGTCTG



TGTTTTTrTTAGMGTATATmAGACGTAmAGAGTTGGTmCTCCTTTGTA 
CTNTAnTAGCATnTATAATATAT 

TAAATTCAAGTTGCATTAATATATATAAATAAAAAAACCTACNAANAAAAAAN 
GN 



>HOL1 (409c5) part2 762bp PathoSeq: 1-762

GATCAGAATAATGAGGACnTATACCTGGAACACTCAATATCTATrCCTTGGAA 
GTTGACTCTGAAGATGAAAACGTGAG 

TCATTACGATGCTTCCAGTCGACCAAAAGTGAAAACAAAAGGCAATATAATCC 
TCTTCCCACAACCATCGAATTCATGCA 

ATGAT CCATTAAATTGGAGTAAATGGAGAAAGCTAAGTAACTrnTTATTGTCA 
TTnTATTACTGCrnTACAGCAGCT 

ACTTCAAATGACGCTGGATCAATTCAAGATTCACTTAATGAAAAATATGGAAT 
TAGTTACGACGCAATGAATACAGGGGC 

AGGCGTmATirrTGGGTATTGGATGGGGTACnTrCTTmAACACCTGCTrCG 
TCGTTATATGGTCGAAAAATAACAT 

ACTTTATATGTATCTITCTrGGTrTATTAGGCGCTGmGGmGCCTTGGTTAA 
AAGCACTTCCGACTCAATTTGGTCG 

CAATTGTTrGTTGGTATTAGTGAGAGTTGTGCTGAAGCTCAAGTACAATTAAGT 
TTATCAGAACTTTAmTGCCCATAA 

CCnTGGTTCTGTGCmCGTCCTATATrGTTGCAACTTCCGTAGGTACrrACTTA 
GGACCTTTAATTGCAGCCTTrATrG 

TTCAAAACATTGGrnTAGATGGGTTGGTTGGATTGCAGCAATTATTAGTGGTG 
CATrATrGTTCGTAATrGTTTTTTGT 

TTAGATGAAACCTATTTTGATCGAGCAAAGnTACCAAGCCA 



>ESP1 1458bp in-house: 889-1458 PathoSeq: 1-888

CGATrrTCAATTACAAGATATTrTGCATCATGTrGAAAGCAAATGGriTGGTGG 
GnTATTTCAGGTATTTTCACTAATG 

ACAATGACGTrGAAAATGAATCCAAGAACOTGnrCATAAATTCAAACAAGAT 
TrAATGAAAATnTGAAAGATTGTTTA 

ACCGTAAGTGACGATAAATCGAATATAGAGAGGTTTCTTCAGTTTAATGAA'nT 
ATTTATTACTGCmTACTCAATGGA 

GGAATATAATTATGAATTGGTTGATGAnTGATAAAAnTATAACTATAAATAT 
GAATTCTCATGGCAGAATAGTTAATT 

TTGGCACTAATGTTAAAATTAATANATTACACGAATTAATrAAGAATTTGATTG 
ATAAAGTTAATAAAAACAAACAAAGA 

TGTGACTAGCAACAACAAAAACAACAGCAACAACAACAGCAACAACAACAGC 
AACAGCAACAATTCCCAACATATTGGTT 

TTGATACCTAATGCCAACTGNTCCAATTTCCCAATGGGAAATCGATGGAANTTC 
NTTCGTAAGTAAAATCCAATTTCAAG 

GAATNGCAATCAANTCCTTANGTTACTTGATCTAGTCAAATCAAACACCAATA 
ACAAGAACAAGTTAATGTTTGTTGATA 

AATCTAATTTGTATTATTTGATTAATCCCAGTGGTGA'nTAATTCGATCAGAAA 
ATCGATTCAAAAAACTATTTGAATCA





AATCATTTATGGAGAGGGGAAATTGGAAAATTATCAAGTAATGAACATGAAGA 
TTATCAAGATTCAATATTATGTGAAAT 

CITGAAAAGTCATITAmGTTrATATTGGTCATGGTGGTTGTGATCAATATATT 
ANAGTATCAAAATTATTTAAAAAAT 

TACCTCCTAGTTTATTGTTAGGTTGTTCATCAGTTAAATTAGATAATTGTAATTA 
TAACTATAATTCCAGTATGTTACAA 

CCACTGGGTAATAnTATAATTGGTTGAACTGTAAATCGTCAATGATACTCGGG 
AATCTATGGGATGTTACTGATAAAGA 

CATTGATATTTTTACACTTTCATTACTACAAAAATGGGGGTrAATAGATGATTA 
TAATGGCAGTGGCCATGATTATGGTA 

TGAAGAAATTGGATTTGACTAATTGTGTTGTrCAAAGTCGAAGTAAATGTACTr 
TGAAATACTTGAATGGATCAGCACCT 

GTGGTTTATGGTCTACCAATGTATTTAAAATAGACATrCTGTTTGCATATAAGT 

TTATATAmTAATAATAAGAAAAAG 

AGCATAAmGGATCirGATTTTGTATTGmGGmGTTATGAACAAATTTTGC 
ACCCAATCACTATCGAACTTTCTTT 

TTtAAACAGAGAACATTTAATCAACATTTATGTTACATTrAAGCGTTTAAATAC 
ATATTTGTGTrAGATAGTTATATAAT

GTrrOATGCAAACATACA



>FAL1 (190g3) 1439bp in-house: 1-770 public: 861-1299 PathoSeq: 771-860/1300-1439 

CTTCTTTTAGAGACAATGCAGTGGrnTCTTACCAGATGCATGACCCCCACCCA 
ATAAAACTATAATCGATCTATTCACA 

GTATTTGATGCCATnTGATGGTGATGAATGATGTGATGTGATGCTCATCTTAT 
TGGGAGTTTCAAAAAAAAAAGTTACA 

CTCGAAAAAAAAAAAATAGCATTATAAATAGAAGCTTTACTATCTTATAGAAC 
AAAACAAAAAACACTATCTTCTAATTA 

ATAATGGATGATTTTGATAGAGATTTAGATAATGAGTTGGAATTTAGTCATAAA 
TCAACGAAAGGAATAAAGGTTCATCG 

CACITITGAAAGTATGAATTTGAAACCTGATCTTTTGAAAGGAATATATGCCTA 
TGGATTTGAAGCACCATCTGCTATTC 

AATCTAGGGCTATTATGCAGATCATCAGTGGTAGAGACACAATAGCACAGGCA 
CAATCTGGAACTGGTAAAACTGCTACT 

TTTTCTATTGGTATGCTTGAGGTTATAGATACTAAATCAAAAGAGTGTCAAGCA 
CTTATCTTGTCTCCTACTAGAGAGTT 

GGCAATTCAAATACAAAATGTGGTCATGCATTrAGGAGATTATATGAACATTC 
ACACCCATGCCTGTATTGGTGGGAAAA 

ATGTCGGTGAGGATGTTAAGAAATTGCAGCAAGGGCAACAAATAGTTAGTGGG 

ACACCAGGTAGAGTGATTGATGTGATA 

AAAAGAAGAAATCTACAAACTAGAAATATCAAGGTTCTTATnTAGATGAAGC 
TGATGAACTTTTTACAAAAGGGnTAA 

AGAACAGATCTACGAAATCTACAAACATTTACCACCTTCGGTTCAAGTAGTAG 
TTGTTAGTGCCACATTGCCACGTGAAG 

TATTGGAGATGACAAGTAGTTTACCACTGATCCAGTGAAAATCTTGGTGAAGA 
GGGATGAGATTTCGCTTCTGGGAATCA 

CACAATATTATGTTCAATGTGAACGTGAAGATTGGAAGTTTGATACACTATGTG 
ATTTGTATGACAACCTTACAATAACT




CAAGCAGTGATATnTGTAATACCAAATTGAAGGTGAATTGGCTTGCTGATCAA 
ATGAAAAAGCAAAACTTTACTGTTGT 

GGCAATGCATGGTGATATGAAACAAGATGAACGAGATTCAATTATGAACGATT 
TTAGAAGGGGGAATTCAAGAGTATTAA 

TATCTACAGATGnTGGGCAAGAGGTATTGATGTCCAACAAGTCTCGTTGGTAA 
TAAATTATGATTTGCCCACCGATAAG 

GAAAACTATATrCATAGAATrGGACGATCAGGTAGATTTGGTAGAAAGGGAAC 
AGCTATAAACTTGATAACTAAAGATGA 

TGTGGTCACnTAAAAGAATTGGAGAAATATrATTCAACGANAATTAAGGAAA 
TGCCAATGAATATTAATGATATAATG

>FBP1 (40c_af) 638bp

AACGTTGGCCTGGCCCAGTTAArrCCGTTTCCAAGCAAATGAATGTCGATACCG 
ACATCATCACGTrGACCCGTTTTATT 

TTACAAGAACAGCAAACTGTTGCTCCCACCGCCACCGGTGAGTrGTCGTrGTTG 
TTGAATGCGCTTCAAnTGCATTCAA 

GTTrATTGCCCACAATATCAGAAGAGCTGAGTTGGTCAACCTTATrGGTGTTTC 
TGGCTCTGCCAACTCTACCGGTGATG 

TTCAGAAGAAATTGGATGTGATTGGTGATGAGATCTTTATCAATGCCATGAGAT 
CTTCCAACAACGTCAAGGTTTTGGTT 

TCrGAAGAGCAAGAAGACCTTATrGTGTrCCCAGGTGGTGGCACATATGCTGTT 
TGTACTGATCCAATTGATGGGTCGTC 

CAATATCGATGCTGGTGTITCTGTTGGTACGATTnTGGTGTGTACAAGTTGCA 
AGAGGGGTCTACTGGTGGCATCAGCG 

ATGTCTTGCGTCCTGGTAAGGAGATGGTCGCTGCGGGGTACACCATGTACGGT 
GCATCTGCCCATTTGGCATTGACTACA 

GGTCACGGNGTCAATCTTTTrACTTTGGATCTCANATGGGTGAATTTATnTGC 
CNATCCAAACTTGGAAGTTCCAGA

>GAL2 (360c6) 1004bp in-house: 625-1004 PathoSeq: 1-624 

TCCATmCCCTmCCTCTrnrCTACATCATCCTCACANCAATITCAAATATG 
TCTCAAGACAACGTCTCATCAACAT 

CTACAGCTGAGGCTGTAAATAATGAAATCAAAGTCAAAGATGAAnrCCACAA 
GAAGAACAAGCTCATACTAGTTTAGAA 

GATAAACCAGTGAGTGCATACATTGGTATCATCATTATGTGnrCCTTATTGCC 
TITGGTGGTTTTGTTTTCGGTTTCGA 

TACTGGTACCATrrCTGGTmATTAATATGTCTGACTTnTAGAAAGATTCGGT 
G GTACT AAAGCTGACGGTACTCTTT 

ACTirrCCAATGTCAGAACTGGTTTAATGATTGGTrTGTTCAACGCTGGTTGTG 
CCATTGGTGMWTTATVCTTGTCYAAA 

GTCGGTGATATGTATGGTAGAAGAGTTGGTATCATGACTGCTATGA1TGYCTAT 
ATTGTTGGTATTATTGTTCAAATTGC 

TTCTCAACATGCTTGGTATCAAGTCATGATrGGTAGAATTATYACTGGTCTTGC 
CGTYGGTATGTTATCAGTTTTATGTC 



CTTrGTTCATTTCCGAGGTTTCTCCAAAACATTTGAGAGGTACTTTGGTGTGCTG 
TrrCCAATTGATGATTACCTTGGGT 

ATCTTCNTGGGNTATTGGCTACCTATGGTACTAAGAGTTACTCAGACTCTAGAC 
AATGGAGAATTCCATTAGGnTATGT 

TrCGCCTGGGCTTTATGTnrGGTTGCTGGTATGGTrAGAATGCCAGAATCTCCA 
CGTTACCTTGTCGGTAAAGACAGAAT 

TGAAGATGCTAAAATGTCACTTGCCAAAACTAACAAGGTTTCTCCAGAGGACC 
CAGCATTATACCGTGAACTrCAATTAA 

TCCAAGCTGGTGTTGAAAGAGAAAGA1TGGCCGGTAAAGCATCTTGGGGTACT 
TTATrCAATGGTAAACCAAGAATCTTT 

GAAAGAGTTATrGTTGGTGTCATGTTACAAGCCTTACAACAATT



>KGD2 (98c_cp) 334bp in-house: 139-334 public: 1-138 

TTCTAACAACAACATCmCTrGGATCTTCAATCAATrCCTTGATGGTTCTTAAG 
AAAATAACAGCTTCACGACCGTCAA 

CTACTCTGTGGTCGTAAGTCAATGCTAAGTACATCATTGGTCTAGAAACGATTT 
GTCCGITAACAGNAATTGGTCnTNT 

TrAAAANTGTGTAAACCAAATACGGNAGTTTAANGCATmTATAATTGGGGT 
ACAGTATAATGATCCAATAACACNGNC 

ATTANAAATAGTOAAAGAACCNCCGGTCATATCTTACAAAGTCAATTTACNAT 
TTCTGGCTTTNTTACNCAAATTANANA 

TTTCCTTTTNAATA



>MAA (249c_a) 619bp

AACCCCACCTTCAAAGACAAAGAAGATTTCGTCAAGCAAACGAATGTCAGAGC 
AGAAAAGAACCAAGAACTAATCAAATT 

TGCCCGTGACAACCTTAACCATTTACCATTCACCGAAAAAGACGGAGGTGCAT 
GGGAAAACTATGAACGAATGATCAGTG 

GTATGCTCTACAACTGnTACAAAAAGAATTGGAAACAACACGTATGTCTTGC 
AGAGACTACATGTTGGACTACGGCAGT 

TTCAGAACTAGAGATTATAAAACAACCCAAGAATTTCTTGATGCAAAATACAA 
ACATtTAGAAAGnrCATTGGACATGT 

TGGCAAAAATGCAmATGGAATATCCAATCTATnTGATTATGGGTTTAACAC 
TTATTTGGGTGATAATTTCTATTCCA 

ATTACAATTTGACAATTTTGGATGTTTCCATAGTCAGAATrGGTAATAATGTCA 
AGTGTGGTCCCAATGTATCTATCCTT 

ACCCCAACACACCCAGTGGATCCCACTTTGCGCTATGATCAATTGGAAAATGC 
CTTGCCTGTGACGGTGGGTAACGGGGT 

CTGGTTGTGTGGAAGCTGTACCATrCTTTGGTGGGGTGACAGTANGTGATGGCA 




GCATT 



>MEG1 (55gl) 1380bp in-house: 1-368 public: 497-1096 PathoSeq: 369-496/1097-1380 



AATTACAATCTGGnTGTrACTACCATATCCCATrAGTGTTATTGTCATrGTAGA 
TATT GATAATG GTTAAAGGATTGGT 

•nrCAiiiiiiGTGTAAT GAATGAGC CAAAATAAAAAATCAATTCGATGCGATG 
CAA TGAAGnTAATAA-AAl'l'irri'lT 

mcmATrrCTTTrAATCAACCCNNCNATCNTrAAATTGAATCAATACCNACC 
ATTAACATACTTCTATATNCNTATA 

TITNTITrACAAAATATChrrGGGGNAGANAACAACTAGTGNTNCNAAAACAAA 
ACNACNTCCTTATCCNTTATTAAAANA 

NATTTCNTCCCCAGGNGGGNAnTAAGAACCGTrCCCAGANNATCATCATCAT 
CATCATCACAAAAGAAGAAATCATCAA 

'^£!^*^^°'^^^''''^^GATGAAGACGACGAAGAAAATGGTGGCGGTGAAGG 
ATnTTAGATGCTTCTAGITCAAGAAAG 

AmTACAATTGGCAAAAGAACAACAGGATGAATTGGAACAGGAAGATGAAA 
CACAAAATCACCnTCATTTGTTCAATC 

ATTAAAAAATCAACAAATAGATAGTGAAGAAGAAGAAGAGGAAGATGATTAT 
TCAGAnTGGAAGAAGAAGAAGAAGTTG 

AAGAGATATTATATGATGAAGAAGATGCAGAAnTATTCCCAAAGATGCAGAA 
TTAnTAATAAATATrTCCAATCCAGC 

GGTGAAGATAATAATAATGATGATGATAATTCATTTCAACCAACAATAAAnr 
AGCTGATAAATTITNAGCCAAAATTCG 

AGAAANAGAATCCCCACAACAACNACAACAACAGAGITTTCCAGATAATAGT 
AATGA AGATG CCGTATTGTTACCACCAA 

AAGTCATnTAGCn-ATGAAAAAATrGGTCAAATnTATCAACITATArrCATG 
GGAAATTACCTAAATTAnTAAAATr 

TrACCAAGTITAAAAAAITGGCAAGATGTATTATACGTGACAAATCCAAATAG 
TTGGACTCCTCATGCCACATATGAAGC 

AACTAAATTATirGTGTCGAATTTATCAAGTAATGAAGCTACAGTnTCATTGA 
AACTATCTTGTTGCCACGATTCCGTG 

ATrCTATrGAAAATTCCGATGATCATTCATTAAATTATCATATTTATCGAGCATr 
A AAAAAAT CATTATATAAACCAGGA 

GCTirmCAAAGGGTTCTTGTrACCTTTAGTCGATGGTTATrGTTCTGTACGTG 
AAGCCACTATTGCTGCTTCAGTGTr 

AACTAMGTITCTGTCCCTGrnTACATTCATGTCATTATrGTGGCGTACTGATG 
AATAAAAAACGAGAATCACCTGTAT
TTGTCCTACGGCGAATATAA



>RNR1 (38) 2562bp in-house: 1-2562



ATGTATGnTATAAGAGAGATGGCCGTAAAGAGCCAGTACGTTTCGACAAAAT 
CACTGCCAGAGTTCAAAGATTATGTTA 

CGGTrrGAATCCAAACCACGTTGAACCAGTrGCTAITACCCAAAAAGrrATATC 
AGGTGnTACCAGGGGGTTACTACTA 

TTGAGTTGGACAACTTGGCTGCAGAAATTGCTGCTACAATGACAACAATrCAC 
CCAGATTACGCTGTCTTAGCCGCTAGA 

ATTGCCGTATCAAAnTACATAAGCAAACCACCAAACAGTATTCCAAAGTGTC 
TAAGGATTTATATGAATACATTAATCC 

TAAGACTGGGTTACACTCTCCTATGAnTCCAAGGAAACCTACGACATCATTAT 
GGAACACGAAGATGAATTAAACTCAG 



CCATTGmACGACAGAGATTTTAACTACAATTATmGGGTTCAAGACTTTGG 
AAAGATCATATTTGTTACGTATCAAC 

GGTAAGGTTGCTGAAAGACCACAACATTTGATCATGAGGGTTGCTGTCGGTAT 
TCACGGTAATGATATACCAAGGGTCAT 

TGAAACCTATAACTTGATGTCTCAAAGATTCTTCACCCATGGTTCTCCTTGTTTA 
nTAACGCTGGTACACCAAGACCAC 

AAATGTCCTCATGmCTTGCTTGCTATGAAGGATGATrCTATTGAAGGTATTT 
ACGACACTTTGAAATCGTGTGCnTG 

ATCTCAAAAAGTGCTGGAGGAATCGGnTACACATCCACAACATTCGTTCTACC 
GGTGCTTACATTGCTGGTACCAATGG 

TACTTCTAATGGTATTATTCCAATGGTAAGAGTATTCAATAACACTGCACGTTA 

TGTCGACCAAGGTGGTAACAAGAGAC 

CTGGTGCCTrTGCCTTGTACTTAGAACCATGGCACAGTGACAiririGATITCA 
TTGATATTAGAAAGAATCACGGTAAA 

GAAGAAATCAGAGCCAGAGATTTGTTCCCAGCnTGTGGATTCCAGATTTGTTC 
ATGAAAAGAGTTGAACAAAATGGTGA 

CTGGACTrTATrCTCACCAAATGAGGCCCCAGGCTTGGCTGATGTITATGGTGA 
CGAATTCGAAGAATTATACACCAAAT 

ACGAAAAAGAAAACCGTGGTAGACAGACCATCAAAGCTCAAAAATTGTGGTA 
TGCTATTTTGGGAGCCCAAACTGAAACA 

GGTACCCCAnTATGTTATATAAAGATTCATGTAACAACAAATCCAACCAAAA 
GAACnTGGGTATTATCAAATCTTCCAA 

OTGTGTrGTGAAATTGTTGAATATrCTGCTCCAGATGAAGTTGCTGTTTGTAA 
CTTGGCTTCCATTGCCTTGCCATCAT 

TTGTrGAAAATGATGAAAAAAGTACTTGGTACAACnTGACAAATTACATCAG 
GTCACTAAGG1TGTCACCCGTAACTTG 

AACAGAGTTATTGACCGTAACCATTACCCAGTCCCAGAAGCTGAAAGATCAAA 
CATGAGACACAGACCAATTGCnTGGG 

TGTTCAAGGTTTGGCTGATGCCTTTATGGAATTGAGATTACCATrTGACTCTCA 
AGAAGCTAGAGAATTGAACATTCAAA 

TTTTTGAGACTATCTACCATGCTGCTGTTGAAGCTTCAATTGAATTGGCTAAAG 
AAGAAGGTGCCTACGAAACCTATCCA 

GGTTCTCCAGCCTCTCAAGGTTTATTACAATTTGATTTGTGGAACAGAAAACCA 
ACTGAATTATGGGATTGGGATACATT 

AAAACAAGATTTGGCCAAACATGGTATGAGAAACTCCTTGTrGGTTGCACCAA 
TGCCTACTGCTTCCACATCACAAATTT 

TCGGTAACAATGAATGirTTGAACCATACACTTCTAACATTTACTCTAGAAGAG 
TATTAGCTGGAGAATTCCAAA1TGTC 

AATCCATATTTATrGAAGGACTTGGTrGATTTGGGTGTCTGGAACGACGCTATG 
AAAAGTAGTA1TATTGCTAACAATGG 

TTCTATCCAAGCCTTACCAAACATCCCTGATGAAATCAAGGCATTGTACAAAA 
CTGTCTGGGAAATCTCACAAAAACATA 

TTATCGACATGGCTGCTGATAGAGCAGCATTTATTGATCAATCTCAATCATrAA 
ACATTCACATCAAAGATCCAACAATG 

GGTAAATTAACCAGTATGCACTTCTACGGTTGGAAGAAAGGnTAAAGACTGG 
TATGTACTACTTAAGAACACAAGCTGC 

CAGTOCTGCTATTCAATTTACCATTGATCAAAAGATrGCTGAGACTGCCGGTCA 
TACGGTTGCAAACTTGGACAAATTAA



ACATTAAGAAATATGTTAACAAAGGAAGAGTTGAGAGTGAGAATACCAGTGAT 
GCTCCATACAAGTCACCATCAACCGAA 

CCAACCTCATTAGAAAGTTCAGTTGCTGATTTGAAAATAAAAGATGAAGGTGA 
AAAGCCAGCTGAAGACAAAACCATTGA 

AGAACTCGAAAATGACATTTATAGTGCCAAAGTTATCGCATGTGCTATTGATA 
ATCCAGAATCTTGTACAATGTGTTCTG



>RPL16 (485cL) 759bp in-house: 1-759 

GGAGGTNTCNOTCTCTGATTCTrCTCCCCTGCTCCACNCAAGGGCCAACCAACA 
ATGAGTCAAGTCGCTCCAAAGTGGTA 

CCAATCAGAAAGACGTTCCAGCTNCAAAACAAACCAGAAAAGACTGCTCGTCC 
ACAAAAATTACGTGCCTCnTAGTCCC 

AGGTACCGTTTTAATTrTATTGGCCGGTAGATTCAGAGGTAAAAGAGTrGTTTA 
CnTGAAGAACTTGGAAGACAACACCT 

TATTGGTTTCTGGTCCAITCAAAGTCAATGGTGTrCCATTGAGAAGAGTrAACG 
CTAGATACGTTATCGCCACCTCCACC 

AAAGTCAACGTTrCTGGTGTrGATGTTTCTAAATTCAACGTCGAATACTITGCT 
AGAGAAAAATCTTCTAAATCTAAAAA 

ATCCGAAGCTGAATTCTTCAATGAATCTCAACCAAAGAAAGAAATCAAAGCTG 
AAAGAG1TGCTGACCAAAAATCTGTCG 

ATGCTGCnTATTAAGTGAAATCAAAAAGACCCCATTATTGAAACAATACTTG 
GCCGCTTCAITCTCTrTGAAGAACGGT 

GACAGACCACACTTGTTAAAATTITAATrrAGGTGAAATrAATATnTGCAAAC 
ATGTTC ATGATAAATAACAATGNGGG 

CI I'l 11 AA AGCAATGGGATGGGGATATGGTTAAGAGGGATGGCnTATATnTG 
AGTTnTATATATGGGGACCTTTGGT 

TTAATAAATGGAANGNTATTGGGCTTCAAAATGAACTTN TTn



>RPS21 (328c3) 391bp in-house: 1-391

AACATTAAAGCAAGATGGAAAACGATAAAGGTCAATTAGTTGAATTATACGTC 
CCAAGAAAATG1TCTGCTACCAACAGA 

ATCATTAAAGCCAAAOATCACGCTTCTGTTCAAATCTCAATTGCTAAAGTTGAT 
GAAGACGGTAGAGCTATTGCTGGTGA 

AAACATCACTTACGCTTTAAGTGGTTACGTTAGAGGTAGAGGTGAAGCTGATG 
ACTCATTAAACAGATTGGCTCAACAAG 

ACGGTTTATTGAAGAACGTCTGGTCTTACTCTCGTTAAGAGAATAGAAGAATA 
GACAAAAITGATAATTGGGTATnTAA 

GAAATTACTITrmATATTGCAAATTAATmAATCrTTCTTCTGTGTATA
ATGNCTTAACATAAT 



GT 




>RVS167 (67gl) part1 733bp in-house: 145-733 public: 1-144



TCTACTTCTGCTTGAGCACTGACCAri l l riCi iCATCTITAACAGTTCTTTCTTT 

CTTCAOTTCATATTTAGAAAAATT 

TCTCNTATGACGATCCAAATCCAATTGTTTATGGTCTCTmCACTGACATnTC 
CTTATAGCrrGTATAATCTTCAATA 

ATTCTTGTGCTGGTTCAACAATTCTTTTTTCAATCAATTCCAAATCGGGTITTAA 
GGTATCnTGAGATCTTTAACCACT 

GCTTGGTACAGTTCCGATGCTTCAATACCTTGTGGGTTATCTTCTGGTACCGTA 
GCACTGGGGTCCGATAATCTACCACT 

GATTGGTTTATAAATCTCAGCCACGGCTTTGGCAAAATCAATTTGTTCATCTAA 
CATCCCATTGACAGCATTGAA ATAT T 

TCTTGGATTCTTCACTCAACTTTmGTTTCCRTTTCGATTTCmGAATOT 
TCAGCATCGAGATAAACAGCATCT 

TGGGTGATTTCTCCCATGTTGAATITCTGACGCATTGTCTGTGGGGCCCTAAGG 
ACACCCnnTGAATCCnTAAATGA 

CATAAGATTGTAATAATTGAAAAAATAAAAGGAGAAGGAAGAAGGAAGGGAG 

ATAGTATATGAAAAGGAAGGGGCGGAGG 

GAATTAATTGTAGGAAGAAGTGGCATTGCTTTITGTCGAAAGCATmTrGAGC 
GTGCGAGAAATTTAATCCAAAAAAAT

GTGTGGTGAAAGG



>RVS167 (67gl) part2 1079bp public: 1-523/668-1079 PathoSeq: 524-667 

AGTGGTAGATTATCGGACCCCAGTGCTACGGTACCAGAAGATACCCTACAAGG 
TATTGAAGCATCGGAACTGTACCAAGC 

AGTGGTTANAGATCCTATAGATACTTTAAACCCCGATTTGGAATTGATTGAAAA 
AAGAATTGTTGAACCAGCACAAGAAT 

TATTGAAGATTATACAAGCTATAAGGAAAATGTCAGTGAAAAGAGACCATAAA 
CAATTGGAITTGGATCGTCATAAGAGA 

AATTnrCTAAATATGAACTGAAGAAAGAAAGAACTGTTAAAGATGAAGAAAA 
AATGTrCAGTGCTCAAGCAGAAGTAGA 

AATrGCTCAACAAGAGTACGATTATTATAATGATTTGTTAAAGAATGAATTGCC 

AGTnTGTATCAAATGCAAAGTGATT 

TTATCAAACCATrGTAIGTATCATrCTATrACATGCAGTrGAATATTTTCTACAC 
ATTATACACTAGAATGGAAGAGTTG 

AAAATTCCATATTTTATnTGTCTACTGATATTGTCGATGCNTATACTGCCAAG 
AAGGGGAACATTGAGGAACAAACCGA 

TTCTATTGGAATCACTCATTTCAAAGTCGGGCATGCCAAATCCAAATTGGAAGC 
CACTAAAAGAAGACATGCTGCTATGA 

AATAGTCCACCTCCTACNGGTGCCAAGCTCTATGGCATCTACAGGAACTGGTG 
GTGAATTACCTGCATACTCCCCAGGAG 

GTTACAACCAACCATATGGTGATAGCAAGTATCAACCACCATCTTCTCCAGCA 
ACATACCAATCTCCAGTAGTAGCAGCC 

ACTGCTCAATCTCCAGCTACTTATCAATCGCCAGTGGCTACTGGACAACCTCCA 

TCATATTTACCACAAACTCCAGCCAG 

TGCTCCACCACCACAAGTTGGTAGTGGCCTTCCAACATGCACGGCnTATACGA 

TTATACTGCACAAGCCCAGGGTGACT





TGACnTCCCTGCAGGAGCTGTTATTGAAATTATACAAAGAACCGAAGATGCC 
AACGGATGGTGGACTGGTAAATACAAT

GGTCAAACCGGTGTGTTCCCTGGTAATTATGTGCAATTA

>SAM2 (36) 1155bp in-house: 1-1155

ATGACTACTrCCAAGGAAACTTrCCTnTCACTTCAGAATCCGTTGGTGAAGGT 
CACCCAGATA AGATT TGTGACCAAGT 

CTCCGATGCCATTITAGATGCTTGTITAGCTGTTGATCCATTGTCAAAAGTTGCT 
TGTGAAAC TGCTGC CAAAACCGGTA 

TGATTATGGTmTGGTGAAATTACCACTAAAGCTCAAITGGATrATCAAAAAA 
TCATTAGAGACACCATTAAACACATT 

GGTTACGACGATrCTGAAAAAGGTnTGATTACAAGACTrGTAACGTCTTGGTT 
GCAATTGAACAACAATCTCCAGATAT 

TGCTCAAGGTITACATrACGAAAAAGCnTGGAAGAGTrGGGTGCTGGTGATC 
AAGGTATTATGTTTGGTTATGCCACCG 

ATGAAACCGATGAAAAATTGCCATTGACCATnTATrGGCCCACAAATTGAAT 
GCTGCCTrGGCTTCTGCCAGAAGATCA 

GGTTCCTTGCCATGGTTGAGACCAGATACCAAAACCCAAGTCACCATCGAGTA 
TGAAAAAGATGGTGGTGCAGTTATCCC 

AAAAAGAGTCGACACAAirGTrATITCCACTCAACATGCCGAAGAAATCACCA 
CCGAAAAnTGAGAAAAGAAATTATTG 

AACATATCATCAAGCAAGTCATCCCAGAACAnTATTAGACGACAAAACTATC 
TACCACATTCAGCCATCAGGCAGATTC 

GTCATTGGTGGTCCCCAAGGTGATGCTGGnTGACTGGTAGAAAGATCATTGTT 
GACACCTATGGTGGTTGGGGTGCACA 

TGGTGGTGGTGCCTrCTCAGGCAAGGATrTCTCCAAAGTTGATAGGTCTGCTGC 
TTATGCCGCTCGGTGGGTTGCTAAGT 

CGTTGGTGACCGCCGGATTGGCCAAAAGGGCCTTGGTGCAGTTCTCCTATGCTA 
TTGGGGTTGCTGAACCCACCAGCATr 

TATATAGACACCTATGGGACATCTAAATTGAGCACCGAAGCCCrTGTAGAAAT 
TATCAAGAATAATTTTGACTTACGCCC 

TGGCGTAATTGTAAAAGAATTAGAmGGCTCGTCCTATTTATnTAAAACCGC 
TTCTTACGGACATnTACTAACCAAG

AAAATTCTTGGGAACAACCAAAAAAATTAAAAnT

>SAP (232c_cp) 619bp

AACCTATAATnTCAGAAAGAGACTAGATTCTGATAGAAATATAOACGCATCA 
CTATAnTTGGAAATATAGATCCACAA 

GTTACGGAGTTGTTAATGTATGAGTTGTTCATCCAATTTGGTCCCGTCAAATCA 
ATCAATATGCCAAAGGATCGTATATT 

GAAAACACACCAGGGGTATGGAnTGTCGAAnTAAAAACTCAGCAGATGCCA 
AATATACTATGGAAATACTACGAGGAA 

TAAGACnTATGCAAAAGCATTGAAATTGAAACGAAITGATGCCAAGTCTCAG 
TCATCAACAAACAACCCAAATAATCAA 



ACAATAGGAACATTTGTACAATCAOATTTGATCAATCCAAATTACATAGATGTT 
GGAGCTAAACTATTTATCAACAATCT 

TAATCCATTGGTCGATGAATCCTTTTTAATGGATACGTTTAGTAAGTTTGGAAC 
CCTTATAAGAAACCCAATAATTAGAC 

GTGATTCAGAGGGACACTCmGGGATACGGATTTCTTACGTACGATGACTTTG 
AAAGTAGTGATTTATGCATACAAAAA 

ATGAACAACACGATTTTGATGAATACCAAAATTGCTATCAGTTATGCATTCAA 



>SHA3 (83c3) 1376bp in-house: 375-1376 PathoSeq: 1-374 

TGNCCTGGAAATCCCCCATTACCATnTAAAGGTACCACCACCCCCCCAAANCT 
TNGCGACTATCCATCCAGGTATTANC 

CCTTGGAGGATTNGCCCATAATAATATGGATGGATCATTTGGAGCAAGGAiGAT 

TTGTCCACTAATATCATGGATAGACAA 

ATATCCACCAANAATAGTCATAGAAAAGTTCCAAGAACAGATTTTGAANCCCA 
ATTATTAATGAAGAATGCCATGTTACA 

ATTGATAGAAGCCATTGAATATTGTCACGAAAATAATATTTACCATTGTGATTT 
AAAACCAGAAAACATTATGGTTAGAT 

ATAATCCATACTATGTTCGTCCAACTATCAATAACAATAATAACAATGGAGAA 

GATGATTTATGCTATGCCAACAGTATT 

ATTGACTATAATGAATTACACCTCGTGTrGATTGATnTGG'nTAGCTATGGAC 
TCTGCTACCAnTGTTGTAATTCATG 

TCGTGGATCGTCATnTACATGGCACCAGAAAGAACCACCAATTATAACACCC 
ATCGTTTAATCAACCAATTAATTGATA 

TGAATCAATATGAGTCAATTGAAATCAATGGGACAACAGTGACAAAATCAAAC 
TGTAAATATTTACCTACATTGGCTGGG 

GATATTTGGTCATrGGGAGTATTGTTCATrAATATCACrrGTTCAAGAAACCCA 

TGGCCCATTGCATCATITGATAATAA 

TCAAAATAATGAAGTGTTTAAGAATTATATGTTGAATAATAACAAGGCTGTTTT 

GAGCAAAATCTTACCCATTTCCTCAC 

AATTTAATCGCTTATTAGATAGAATnTCAAATTGAATCCTAATGATAGAATAG 
ATTTACCAACnTATACAAAGAAOTT 

ATTCGTTGTGATTTCTTCAAAGATGATCATTACTACTATGCCCAACATCAACAT 
CATCACAATCACAATCAAATCAATAA 

TGCTTACAATCACTATCAGAAACAACCTAATCAAGCAAGACCTACTGCAAACC 
AACAATTGTATACACCACCGGAAACCA 

CCACTTATAATTCATACGCTAGTGATATGGAAGAAGATGAAATTAGTGATGAT 

GAGTITTATrCTGATGAAGAAGATGAA 

GATATTGAAGACTATGAAGAGGAAGAGGAAGAGTATTTTGGTAATGAGCAAC 

AACAACAACAGCAAGTCACAACAGTGAA 

TGGTAATTTTGGTCAAGTTAAAGGTACCTGTTATTACGATACCAAAACCAAAA 
CAACTACATATATAAAACCACCAGCTG 

CATATACTTTAGAGACGCCTAGTCAAAGTGTTGAATACTGTTAAGTTGTACACA 
TAAATAATTAATGACAATTAATAATA



NGGATC 




ACGATTAATAATATAG 




>TPI1 (233c_cp2) 636bp 

AACCAATmAGAAAC.WGGCTCGTCAATTTTTCGTAGGTGGTAACTrCAAAG 
CTAACGGTACCAAACA-ACAAATCACT 

TCAATCATCGACAACTTGAACAAGGCTGATTTACCAAAGGATGTCGAAGITGT 
CATTTGTCCACCCGCCCTTTACCTTGG 

TTTAGCTGTTGAGCAA.^CAAACAACCAACTGTTGCCATTGGTGCTCAAAATG 
TTTTTGACAAGTCATGTGGTGCTTTCA 

CTGGTGAAACCTGTGCTTCTCAAATCTTGGATGTTGGTGCCAGCTGGACTTTAA 
CTGGTCACAGTGAAAG.^GAACCATT 

ATNAAAGAATCCGATG.4ATTCATTGCTGAAAAAACCAAGTTTGCCTTGGACAC 
TGGTGTCAAAGTTATTNTATGTATTGG 

TGAAACCTTAGAGGAA.'VGAAAAGGTNGTGTCACTTTGGATGTTTGNGCCAGAC 
AATTGGGATGCTGGTTCCAAGATTGNN 

rrGATTGGTCAAACATTGNTGGhWCTTACGAACCTGTTTTGGNCAATTGGGTCT 
GGTTTANCCCGNTNCCCCANAAGATG 

CTGAAGAAACCTACAAGGTNTTAGACTCATTTGGNCAAGANCATTNGTGNCNA 
ACAACTGAAAAACCNGANTNTNG 
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1 TTCCATCGGG GAAAGTGGGG GGGAAAAAAT TTTAAGCAGT TCACAAAACC 
AAGGTAGCCC CTTTCACCCC CCCTTTTTTA AAATTCGTCA AGTGTTTTGG 



51 TTCCAAAAAA TATATGGACA AAGATGATTG TATTTTCCCG ACACCAAAAT 
AAGGTTTTTT ATATACCTGT TTCTACTAAC ATAAAAGGGC TCTCGTTTTA 



101 CATAATTAAT TATGAGAAAG TTAAATGTAA CGTTACAATT TATGTTTATT
GTATTAATTA ATACTCTTTC AATTTACATT GCAATGTTAA ATACAAATAA 



151 TGAAGGTGAA AAGCGATTTA TGATTTTTCC GAAATGAAAA TTTTTTTTAG 
ACTTCCACTT TTCGCTAAAT ACTAAAAAGG CTTTACTTTT AAAAAAAATC



201 GTTTATTTTT TTTGTCGGGC AAAGAAAAAC TGAACAAGGA TTATTAAAAT 
CAAATAAAAA AAACAGCCCG TTTCTTTTTG ACTTGTTCCT AATAATTTTA 



EcoRI 



251 TTTTGGTGTT TGTTTGTGTC TGGAGAATTC ATTCCTCTCT CATCTTCACA 
AAAACCACAA ACAAACACAG ACCTCTTAAG TAAGGAGAGA GTAGAAGTGT 



301 CAATGTTTAG ACATCTGACA CGATTCATGA TAGTTCGGTT TCCGGGGTTG
GTTACAAATC TGTAGACTGT GCTAAGTACT ATCAAGCCAA AGGCCCCAAC 



351 GTGTTTAGTT TTCGTTTTTC TTTTTTTTTG GAAAGAATGT TTTAGCTCAT 
CACAAATCAA AAGCAAAAAG AAAAAAAAAC CTTTCTTACA AAATCGAGTA 



401 TGGTTTTCTT TCTTCATTCA ATAGTTTTGA AAGAATTTGC CCACTTGTTA 
ACCAAAAGAA AGAAGTAAGT TATCAAAACT TTCTTAAACG GGTGAACAAT 



451 TTACAATCAT ATAAAATTAA ACTTTGATAT AAAATAGAGT TTGAAAGTTT 
AATGTTAGTA TATTTTAATT TGAAACTATA TTTTATCTCA AACTTTCAAA 



501 CCCAGATCCT TTTTGATTTC TTTGTAAATT TTTTTTTCTC CCACATATAC 
GGGTCTAGGA AAAACTAAAG AAACATTTAA AAAAAAAGAG GGTGTATATG



PstI 



551 ACACATACAA ACCGATTTTT ATAAGAAAGA GTTATACCCT GCAGCTCGAC 
TGTGTATGTT TGGCTAAAAA TATTCTTTCT CAATATGGGA CGTCGAGCTC



PstI HindIII AvaI



601 CTCGACTGTT TAAACCTGCA GGCATGCAAG CTTGGCCAAA AAGGCCTCGA 
GAGCTGACAA ATTTGGACGT CCGTACGTTC GAACCGGTTT TTCCGGAGCT



AvaI

651 GGAACATGAC CAACAAGTGT CTCCTCCAAA TTGCTCTCCT GTTGTGCTTC 
CCTTGTACTG GTTGTTCACA GAGGAGGTTT AACGAGAGGA CAACACGAAG 



701 TCCACTACAG CTCTTTCCAT GAGCTACAAC TTGCTTGGAT TCCTACAAAG 
AGGTGATGTC GAGAAAGGTA CTCGATGTTG AACGAACCTA AGGATGTTTC 



751 AAGCAGCAAT TTTCAGTGTC AGAAGCTCCT GTGGCAATTG AATGGGAGGC 
TTCGTCGTTA AAAGTCACAG TCTTCGAGGA CACCGTTAAC TTACCCTCCG



801 TTGAATACTG CCTCAAGGAC AGGATGAACT TTGACATCCC TGAGGAGATT 
AACTTATGAC GGAGTTCCTG TCCTACTTGA AACTGTAGGG ACTCCTCTAA 
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99 



PstI 



851 AAGCAGCTGC AGCAGTTCCA GAAGGAGGAC GCCGCATTGA CCATCTATGA 
TTCGTCGACG TCGTCAAGGT CTTCCTCCTG CGGCGTAACT GGTAGATACT 



901 GATGCTCCAG AACATCTTTG CTATTTTCAG ACAAGATTCA TCTAGCACTG 
CTACGAGGTC TTGTAGAAAC GATAAAAGTC TGTTCTAAGT AGATCGTGAC 



951 GCTGGAATGA GACTATTGTT GAGAACCTCC TGGCTAATGT CTATCATCAG 
CGACCTTACT CTGATAACAA CTCTTGGAGG ACCGATTACA GATAGTAGTC 



1001 ATAAACCATC TGAAGACAGT CCTGGAAGAA AAACTGGAGA AAGAAGATTT 
TATTTGGTAG ACTTCTGTCA GGACCTTCTT TTTGACCTCT TTCTTCTAAA 



1051 CACCAGGGGA AAACTCATGA GCAGTCTGCA CCTGAAAAGA TATTATGGGA 
GTGGTCCCCT TTTGAGTACT CGTCAGACGT GGACTTTTCT ATAATACCCT 



1101 GGATTCTGCA TTACCTGAAG GCCAAGGAGT ACAGTCACTG TGCCTGGACC 
CCTAAGACGT AATGGACTTC CGGTTCCTCA TGTCAGTGAC ACGGACCTGG 



1151 ATAGTCAGAG TGGAAATCCT AAGGAACTTT TACTTCATTA ACAGACTTAC 
TATCAGTCTC ACCTTTAGGA TTCCTTGAAA ATGAAGTAAT TGTCTGAATG 



1201 AGGTTACCTC CGAAACTGAA GATCTCCTAG CCTGTGCCTC TGGGACTGGA 
TCCAATGGAG GCTTTGACTT CTAGAGGATC GGACACGGAG ACCCTGACCT 



1251 CAATTGCTTC AAGCATTCTT CAACCAGCAG ATGCTGTTTA AGTGACTGAT 
GTTAACGAAG TTCGTAAGAA GTTGGTCGTC TACGACAAAT TCACTGACTA 



1301 GGCTAATGTA CTGCATATGA AAGGACACTA GAAGATTTTG AAATTTTTAT 
CCGATTACAT GACGTATACT TTCCTGTGAT CTTCTAAAAC TTTAAAAATA 



1351 TAAATTATGA GTTATTTTTA TTTATTTAAA TTTTATTTTG GAAAATAAAT 
ATTTAATACT CAATAAAAAT AAATAAATTT AAAATAAAAC CTTTTATTTA 



SmaI
BamHI

AvaI AvaI

1401 TATTTTTGGT GCAAAAGTCC CTCGAGGCCT AGCGGCCGCC TAGAGGATCC 
ATAAAAACCA CGTTTTCAGG GAGCTCCGGA TCGCCGGCGG ATCTCCTAGG 



XmaI



SmaI



AvaI



1451 CCGGGCGCTA GGCGGCCGCT AGGCCTTTTT GGCCAAGCTC GAATTTCGAG
GGCCCGCGAT CCGCCGGCGA TCCGGAAAAA CCGGTTCGAG CTTAAAGCTC 



XmaI



SmaI



EcoRI AvaI ClaI



1501 GAATTCGAGC TCGGTACCCG GGGGATCGAT CCGTCCCCCT TTTCCTTTGT 
CTTAAGCTCG AGCCATGGGC CCCCTAGCTA GGCAGGGGGA AAAGGAAACA 
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1551 CGATATCATG TAATTAGTTA TGTCACGCTT ACATTCACGC CCTCCCCCCA 
GCTATAGTAC ATTAATCAAT ACAGTGCGAA TGTAAGTGCG GGAGGGGGGT 

1601 CATCCGCTCT AACCGAAAAG GAAGGAGTTA GACAACCTGA AGTCTAGGTC 
GTAGGCGAGA TTGGCTTTTC CTTCCTCAAT CTGTTGGACT TCAGATCCAG 



1651 CCTATTTATT TTTTTATAGT TATGTTAGTA TTAAGAACGT TATTTATATT 
GGATAAATAA AAAAATATCA ATACAATCAT AATTCTTGCA ATAAATATAA 



1701 TCAAATTTTT CTTTTTTTTC TGTACAGACG CGTGTACGCA TGTAACATTA
AGTTTAAAAA GAAAAAAAAG ACATGTCTGC GCACATGCGT ACATTGTAAT 



1751 TACTGAAAAC CTTGCTTGAG AAGGTTTTGG GACGCTCGAA GGCTTTAATT 
ATGACTTTTG GAACGAACTC TTCCAAAACC CTGCGAGCTT CCGAAATTAA 

1801 TGCAAGCTAG CTTGGCGTAA TCATGGTCAT AGCTGTTTCC TGTGTGAAAT
ACGTTCGATC GAACCGCATT AGTACCAGTA TCGACAAAGG ACACACTTTA 

1851 TGTTATCCGC TCACAATTCC ACACAACATA CGAGCCGGAA GCATAAAGTG
ACAATAGGCG AGTGTTAAGG TGTGTTGTAT GCTCGGCCTT CGTATTTCAC 



1901 TAAAGCCTGG GGTGCCTAAT GAGTGAGCTA ACTCACATTA ATTGCGTTGC
ATTTCGGACC CCACGGATTA CTCACTCGAT TGAGTGTAAT TAACGCAACG 



1951 GCTCACTGCC CGCTTTCCAG TCGGGAAACC TGTCGTGCCA GAGATCTCTG 
CGAGTGACGG GCGAAAGGTC AGCCCTTTGG ACAGCACGGT CTCTAGAGAC 



2001 CATTAATGAA TCGGCCAACG CGCGGGGAGA GGCGGTTTCC GTATTGGGCG 
GTAATTACTT AGCCGGTTGC GCGCCCCTCT CCGCCAAACG CATAACCCGC 



2051 CTCTTCCGCT TCCTCGCTCA CTGACTCGCT GCGCTCGGTC GTTCGGCTGC 
GAGAAGGCGA AGGAGCGAGT GACTGAGCGA CGCGAGCCAG CAAGCCGACG



ClaI



2101 GGCGAGCGGT ATCAGATCGA TCTCACTCAA AGGCGGTAAT ACGGTTATCC
CCGCTCGCCA TAGTCTAGCT AGAGTGAGTT TCCGCCATTA TGCCAATAGG 



2151 ACAGAATCAG GGGATAACGC AGGAAAGAAC ATGTGAGCAA AAGGCCAGCA 
TGTCTTAGTC CCCTATTGCG TCCTTTCTTG TACACTCGTT TTCCGGTCGT



2201 AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT TTCCATAGGC
TTTCCGGTCC TTGGCATTTT TCCGGCGCAA CGACCGCAAA AAGGTATCCG 



2251 TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAG TCAGAGGTGG
AGGCGGGGGG ACTGCTCGTA GTGTTTTTAG CTGCGAGTTC AGTCTCCACC

2301 CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC CTGGAAGCTC 
GCTTTGGGCT GTCCTGATAT TTCTATGGTC CGCAAAGGGG GACCTTCGAG



2351 CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA TACCTGTCCG
GGAGCACGCG AGAGGACAAG GCTGGGACGG CGAATGGCCT ATGGACAGGC 



2401 CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC ACGCTGTAGG 
GGAAAGAGGG AAGCCCTTCG CACCGCGAAA GAGTATCGAG TGCGACATCC



ApaLI 



2451 TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT GTGTGCACGA 
ATAGAGTCAA GCCACATCCA GCAAGCGAGG TTCGACCCGA CACACGTGCT 
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2501 ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC TATCGTCTPG 
TGGGGGGCAA GTCGGGCTGG CGACGCGGAA TAGGCCATTG ATAGCAGAAC 



2551 AGTCCAACCC GGTAAGACAC GACTTATCGC CACTGGCAGC AGCCACTGGT
TCAGGTTGGG CCATTCTGTG CTGAATAGCG GTGACCGTCG TCGGTGACCA 



2601 AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG AGTTCTTGAA 
TTGTCCTAAT CGTCTCGCTC CATACATCCG CCACGATGTC TCAAGAACTT 



2651 GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT GGTATCTGCG
CACCACCGGA TTGATGCCGA TGTGATCTTC CTGTCATAAA CCATAGACGC



2701 CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG CTCTTGATCC 
GAGACGACTT CGGTCAATGG AAGCCTTTTT CTCAACCATC GAGAACTAGG 



2751 GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT GCAAGCAGCA 
CCGTTTGTTT GGTGGCGACC ATCGCCACCA AAAAAACAAA CGTTCGTCGT 



2801 GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG ATCTTTTCTA 
CTAATGCGCG TCTTTTTTTC CTAGAGTTCT TCTAGGAAAC TAGAAAAGAT 



2851 CGGGGTCTGA CGCTCAGTGG AACGAAAACT CACGTTAAGG GATTTTGGTC 
GCCCCAGACT GCGAGTCACC TTGCTTTTGA GTGCAATTCC CTAAAACCAG 



2901 ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA ATTAAAAATG 
TACTCTAATA GTTTTTCCTA GAAGTGGATC TAGGAAAATT TAATTTTTAC 



2951 AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT 
TTCAAAATTT AGTTAGATTT CATATATACT CATTTGAACC AGACTGTCAA 



3001 ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG TCTATTTCGT 
TGGTTACGAA TTAGTCACTC CGTGGATAGA GTCGCTAGAC AGATAAAGCA 



3051 TCATCCATAG TTGCCTGACT CCCCGTCGTG TAGATAACTA CGATACGGGA
AGTAGGTATC AACGGACTGA GGGGCAGCAC ATCTATTGAT GCTATGCCCT 



3101 GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA GACCCACGCT 
CCCGAATGGT AGACCGGGGT CACGACGTTA CTATGGCGCT CTGGGTGCGA



3151 CACCGGCTCC AGATTTATCA GCAATAAACC AGCCAGCCGG AAGGGCCGAG 
GTGGCCGAGG TCTAAATAGT CGTTATTTGG TCGGTCGGCC TTCCCGGCTC



3201 CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT CTATTAATTG 
GCGTCTTCAC CAGGACGTTG AAATAGGCGG AGGTAGGTCA GATAATTAAC



3251 TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT TTGCGCAACG 
AACGGCCCTT CGATCTCATT CATCAAGCGG TCAATTATCA AACGCGTTGC 



3301 TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC GTTTGGTATG
AACAACGGTA ACGATGTCCG TAGCACCACA GTGCGAGCAG CAAACCATAC 



3351 GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA CATGATCCCC 
CGAAGTAAGT CGAGGCCAAG GGTTGCTAGT TCCGCTCAAT GTACTAGGGG 



3401 CATGTTGTGC AAAAAAGCGG TTAGCTCCTT CGGTCCTCCG ATCGTTGTCA 
GTACAACACG TTTTTTCGCC AATCGAGGAA GCCAGGAGGC TAGCAACAGT 



3451 GAAGTAAGTT GGCCGCAGTG TTATCACTCA TGGTTATGGC AGCACTGCAT 
CTTCATTCAA CCGGCGTCAC AATAGTGAGT ACCAATACCG TCGTGACGTA 
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3501 AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG TGACTGGTGA 
TTAAGAGAAT GACAGTACGG TAGGCATTCT ACGAAAAGAC ACTGACCACT 



3551 GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA CCGAGTTCCT 
CATGAGTTGG TTCAGTAAGA CTCTTATCAC ATACGCCGCT GGCTCAACGA 



3601 CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG CAGAACTTTA 
GAACGGGCCG CAGTTATGCC CTATTATGGC GCGGTGTATC GTCTTGAAAT 



3651 AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGCGAAAAC TCTCAAGGAT
TTTCACGAGT AGTAACCTTT TGCAAGAAGC CCCGCTTTTG AGAGTTCCTA



ApaLI 



3701 CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT GCACCCAACT 
GAATGGCGAC AACTCTAGGT CAAGCTACAT TGGGTGAGCA CGTCGGTTGA 



3751 GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG AGCAAAAACA 
CTAGAAGTCG TAGAAAATGA AAGTGGTCGC AAAGACCCAC TCGTTTTTGT 



3801 GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC GGAAATGTTC 
CCTTCCGTTT TACGGCGTTT TTTCCCTTAT TCCCGCTGTG CCTTTACAAC



3851 AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT TATCAGGGTT 
TTATGAGTAT GAGAAGGAAA AAGTTATAAT AACTTCGTAA ATAGTCCCAA 



3901 ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA 
TAACAGAGTA CTCGCCTATG TATAAACTTA CATAAATCTT TTTATTTGTT 



3951 ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCTAAGA 
TATCCCCAAG GCGCGTGTAA AGGGGCTTTT CACGGTGGAC TGCAGATTCT 



4001 AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT ATCACGAGGC 
TTGGTAATAA TAGTACTGTA ATTGGATATT TTTATCCGCA TAGTGCTCCG



4051 CCTTTCGTCT CGCGCGTTTC GGTGATGACG GTGAAAACCT CTCACACATC 
GGAAAGCAGA GCGCGCAAAG CCACTACTGC CACTTTTGGA GACTGTGTAC 



4101 CAGCTCCCGG AGACGGTCAC AGCTTGTCTG TAAGCGGATG CCGGGAGCAG
GTCGAGGGCC TCTGCCAGTG TCGAACAGAC ATTCGCCTAC GGCCCTCGTC 



4151 ACAAGCCCGT CAGGGCGCGT CAGCGGGTGT TGGCGGGTGT CGGGGCTGGC
TGTTCGGGCA GTCCCGCGCA GTCGCCCACA ACCGCCCACA GCCCCGACCG 



ApaLI 



4201 TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA CCATATCGAC 
AATTGATACG CCGTAGTCTC GTCTAACATG ACTCTCACGT GGTATAGCTC 



4251 GCTCTCCCTT ATGCGACTCC TGCATTAGGA AGCAGCCCAG TAGTAGGTTG 
CGAGAGGGAA TACGCTGAGG ACGTAATCCT TCGTCGGGTC ATCATCCAAC



4301 AGGCCGTTGA GCACCGCCGC CGCAAGGAAT GGTGCATGCA AGGAGATCGC
TCCGGCAACT CGTGGCGGCG GCGTTCCTTA CCACGTACGT TCCTCTACCG 



4351 GCCCAACAGT CCCCCGGCCA CGGGGCCTGC CACCATACCC ACGCCGAAAC 
CGGGTTGTCA GGGGGCCGGT GCCCCGGACG GTGGTATGGG TGCGGCTTTG 



4401 AAGCACTAAT AGGAATTGAT TTGGATGGTA TAAACGGAAA CAAAAAAAAG 
TTCGTGATTA TCCTTAACTA AACCTACCAT ATTTGCCTTT GTTTTTTTTC 



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4451 AGCTGGTACT ACTTTCTTTA AAATTATTTT ATTATTTGAT TTTATTTAAT 
TCGACCATGA TGAAAGAAAT TTTAATAAAA TAATAAACTA AAATAAATTA 



4501 AGTATATATT ATATTTTGAA CGTAGATTAT TTTGTTGAAA GTTGCTGTAG 
TCATATATAA TATAAAACTT GCATCTAATA AAACAACTTT CAACGACATC 



4551 TGCCATTGAT TCGTAACACT AATTCTGTAT TAGTCATTCC TCTTGTTTGA
ACGGTAACTA AGCATTGTGA TTAAGACATA ATCAGTAAGG AGAACAAACT 



4601 TAGTATCCAA AAAAACGGCT ATTTTTTTGC AATCTTATTT CCTGCATATT 
ATCATAGGTT TTTTTGCCGA TAAAAAAACG TTAGAATAAA GGACGTATAA 



4651 ATACAGATAA CATAATGAAA GAAAAAATCT TTTTTTTTGT TCTTCAATGA 
TATGTCTATT GTATTACTTT CTTTTTTAGA AAAAAAAACA AGAAGTTACT 



4701 TGATTTCAAC CATTCTTTTA AACATTGATC AATTCCTGAG CAACAACCCC 
ACTAAAGTTG GTAAGAAAAT TTGTAACTAG TTAAGGACTC GTTGTTGGGG 



4751 ATACACACTG GTTTATATAC CGCCCCTTTT ACAGTTGAAG AAAGAAATAG 
TATGTGTGAC CAAATATATG GCGGGGAAAA TGTCAACTTC TTTCTTTATC 



4801 AAATAGAAAT AGCAAACAAA AGATATGACA GTCAACACTA AGACCTATAG 
TTTATCTTTA TCGTTTGTTT TCTATACTGT CAGTTGTGAT TCTGGATATC 



4851 TGAGAGAGCA GAAACTCATG CCTCACCAGT AGCACAGCGA TTATTTCGAT 
ACTCTCTCGT CTTTGAGTAC GGAGTGGTCA TCGTGTCGCT AATAAAGCTA 



4901 TAATGGAACT GAAGAAAACC AATTTATGTG CATCAATTGA CGTTGATACC 
ATTACCTTGA CTTCTTTTGG TTAAATACAC GTAGTTAACT GCAACTATGG 



AvaI



4951 ACTAAGGAGT TCCTCGAGTT AATTGATAAA TTAGGTCCTT ATGTATGCTT 
TGATTCCTCA AGGAGCTCAA TTAACTATTT AATCCAGGAA TACATACGAA 



5001 AATCAAGACT CATATTGATA TAATCAATGA TTTTTCCTAT GAATCCACTA 
TTAGTTCTGA GTATAACTAT ATTAGTTACT AAAAAGGATA CTTAGGTGAT 



5051 TTGAACCATT ATTAGAACTT TCACGTAAAC ATCAATTTAT GATTTTTGAA 
AACTTGGTAA TAATCTTGAA AGTGCATTTG TAGTTAAATA CTAAAAACTT 



5101 GATAGAAAAT TTGCTGATAT TGGTAATACC GTAAAGAAAC AATATATTGG 
CTATCTTTTA AACGACTATA ACCATTATGG CATTTCTTTG TTATATAACC 



5151 TGGAGTTTAT AAAATTAGTA GTTGGGCAGA TATTACCAAT GCTCATGGTG 
ACCTCAAATA TTTTAATCAT CAACCCGTCT ATAATGGTTA CGAGTACCAC



5201 TCACTGGGAA TGGAGTGGTT GAAGGATTAA AACAGGGAGC TAAAGAAACC 
AGTGACCCTT ACCTCACCAA CTTCCTAATT TTGTCCCTCG ATTTCTTTGG 



5251 ACCACCAACC AAGAGCCAAG AGGGTTATTG ATGTTAGCTG AATTATCATC 
TGGTGGTTGG TTCTCGGTTC TCCCAATAAC TACAATCGAC TTAATAGTAG 



5301 AGTGGGATCA TTAGCATATG GAGAATATTC TCAAAAAACT GTTGAAATTG 
TCACCCTAGT AATCGTATAC CTCTTATAAG AGTTTTTTGA CAACTTTAAC 



5351 CTAAATCCGA TAAGGAATTT GTTATTGGAT TTATTGCCCA ACGTGATATG 
GATTTAGGCT ATTCCTTAAA CAATAACCTA AATAACGGGT TGCACTATAC 
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5401 GGTGGCCAAG AAGAAGGATT TGATTGGCTT ATTATGACAC CTGGAGTTGG 
CCACCGGTTC TTCTTCCTAA ACTAACCGAA TAATACTGTG GACCTCAACC 



5451 ATTAGATGAT AAAGGTGATG GATTAGGACA ACAATATAGA ACTGTTGATG 
TAATCTACTA TTTCCACTAC CTAATCCTGT TGTTATATCT TGACAACTAC 



5501 AAGTTGTTAG CACTGGAACT GATATTATCA TTGTTGGTAG AGGATTGTTT 
TTCAACAATC GTGACCTTGA CTATAATAGT AACAACCATC TCCTAACAAA 



5551 GGTAAAGGAA GAGATCCAGA TATTGAAGGT AAAAGGTATA GAAATGCTGG 
CCATTTCCTT CTCTAGGTCT ATAACTTCCA TTTTCCATAT CTTTACGACC 



5601 TTGGAATGCT TATTTGAAAA AGACTGGCCA ATTATAAATG TGAAGGGGGA 
AACCTTACGA ATAAACTTTT TCTGACCGGT TAATATTTAC ACTTCCCCCT 



5651 GATTTTCACT TTATTAGATT TGTATATATG TAGAATAAAT AAATAAATAA 
CTAAAAGTGA AATAATCTAA ACATATATAC ATCTTATTTA TTTATTTATT 



5701 GTTAAATAAA TAATTAAATA AGGGTGGTAA TTATTACTAT TTACAATCAA 
CAATTTATTT ATTAATTTAT TCCCACCATT AATAATGATA AATGTTAGTT 



5751 AGGTGGTCCT TCTAGCTGTA ATCCGGGCAG CGCAACGGAA CATTCATCAG 
TCCACCAGGA AGATCGACAT TAGGCCCGTC GCGTTGCCTT GTAAGTAGTC



5801 TGTAAAAATG GAATCAATAA AGCCCTGCGC TCATGAGCCC GAAGTGGCGA 
ACATTTTTAC CTTAGTTATT TCGGGACGCG AGTACTCGGG CTTCACCGCT 



5851 GCCCGATCTT CCCCATCGGT GATGTCGGCG ATATAGGCGC CAGCAACCGC 
CGGGCTAGAA GGGGTAGCCA CTACAGCCGC TATATCCGCG GTCGTTGGCG



5901 ACCTGTGGCG CCGCAGCGCG CAGGGTCAGC CTGAATACGC GTTTAATGAC 
TGGACACCGC GGCGTCGCGC GTCCCAGTCG GACTTATGCG CAAATTACTG 



5951 CAGCACAGTC GTGATGGCAA GGTCAGAATA GCCCAAGTCG GCCGAGGGGC 
GTCGTGTCAG CACTACCGTT CCAGTCTTAT CGGGTTCAGC CGGCTCCCCG 



6001 CTGTACAGTG AGGGAAGATC TGATATTGAC GAAGAGGAAC CAATCTAACG 
GACATGTCAC TCCCTTCTAG ACTATAACTG CTTCTCCTTG GTTACATTGC 



6051 TTACACTGAA GAAAACACAC AATAAACGGG AAGAAACGGT GTAAAAGTGT 
AATGTGACTT CTTTTGTGTG TTATTTGCCC TTCTTTGCCA CATTTTCACA 



6101 GAAAATAATT TTTGAATATC ATTTCCCTTG GTTTAATTCC AAACGAAACG 
CTTTTATTAA AAACTTATAG TAAAGGGAAC CAAATTAAGG TTTGCTTTGC 



EcoRI 



6151 TGTTTTTTTT AGAGAATGGG AATTCTTATT GGATGTCTAG ATTGTTTGTT 
ACAAAAAAAA TCTCTTACCC TTAAGAATAA CCTACAGATC TAACAAACAA 



ApaLI



6201 TACTCCAGAC TGTGCACAAA AACGTTTGGA TGGATGATCA GAAGATATTT 
ATGAGGTCTG ACACGTGTTT TTGCAAACCT ACCTACTAGT CTTCTATAAA 



6251 TTAGGCTTAG CTCTAAATAT AAGAAATGAT GCTTGAAAAA CCAGACAGAA 
AATCCGAATC GAGATTTATA TTCTTTACTA CGAACTTTTT GGTCTGTCTT 



6301 ATTGAGTTTC AAAAATTGGT AATGTGAGGT ATTAGTCAAC TAACCAAATA 
TAACTCAAAG TTTTTAACCA TTACACTCCA TAATCAGTTG ATTGGTTTAT 
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6351 ACAATGCAAA CCGGTTGATA CATTTCATTT TGAAAATAAT GAAACTGGAA 
TCTTACGTTT GGCCAACTAT GTAAAGTAAA ACTTTTATTA CTTTGACCTT



6401 TTGGATGACC AGCACACAAA CACATAAAGT AATTATGGGA ATTAGAAGCG 
AACCTACTGG TCGTGTGTTT GTGTATTTCA TTAATACCCT TAATCTTCGC 



6451 AACATAGAGG AGTACTTGGC CACGAACAGA ATACAAGTGG GAACACTATT 
TTGTATCTCC TCATGAACCG GTGCTTGTCT TATGTTCACC CTTGTGATAA 



6501 TTCTCCATTG TTTTAGTTCT GTTTTTTTGT CAGCCTAGTT TTGTGCTATG 
AAGAGGTAAC AAAATCAAGA CAAAAAAACA GTCGGATCAA AACACGATAC 



HindIII

6551 TGTAAAAAAT ATTGCCAAGA AAAAAAGCTT GTTTTGTGGC CAGTGTCCGA
ACATTTTTTA TAACGGTTCT TTTTTTCGAA CAAAACACCG GTCACAGGCT



6601 AAAAAATTTT GGGGAATCTT CGGATTAATT TATGTTTTCA 
TTTTTTAAAA CCCCTTAGAA GCCTAATTAA ATACAAAAGT 
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Hindi II 

1 AGCTTGAGTA TTCTATAGTG TCACCTAAAT AGCTTGGCGT AATCATGGTC 
TCGAACTCAT AAGATATCAC AGTGGATTTA TCGAACCGCA TTAGTACCAG 



51 ATAGCTGTTT CCTGTGTGAA ATTGTTATCC GCTCACAATT CCACACAACA 
TATCGACAAA GGACACACTT TAACAATAGG CGAGTGTTAA GGTGTGTTGT 



101 TACGAGCCGG AAGCATAAAG TGTAAAGCCT GGGGTGCCTA ATGAGTGAGC 
ATGCTCGGCC TTCGTATTTC ACATTTCGGA CCCCACGGAT TACTCACTCG 



151 TAACTCACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 
ATTGAGTGTA ATTAACGCAA CGCGAGTGAC GGGCGAAAGG TCAGCCCTTT 



201 CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG
GGACAGCACG GTCGACGTAA TTACTTAGCC GGTTGCGCGC CCCTCTCCGC



251 GTTTGCGTAT TGGGCGCTCT TCCGCTTCCT CGCTCACTGA CTCGCTGCGC 
CAAACGCATA ACCCGCGAGA AGGCGAAGGA GCGAGTGACT GAGCGACGCG 



301 TCGGTCGTTC GGCTGCGGCG AGCGGTATCA GCTCACTCAA AGGCGGTAAT 
AGCCAGCAAG CCGACGCCGC TCGCCATAGT CGAGTGAGTT TCCGCCATTA 



351 ACGGTTATCC ACAGAATCAG GGGATAACGC AGGAAAGAAC ATGTGAGCAA 
TGCCAATAGG TGTCTTAGTC CCCTATTGCG TCCTTTCTTG TACACTCGTT 



401 AAGGCCAGCA AAAGGCCAGG AACCGTAAAA AGGCCGCGTT GCTGGCGTTT
TTCCGGTCGT TTTCCGGTCC TTGGCATTTT TCCGGCGCAA CGACCGCAAA 



451 TTCCATAGGC TCCGCCCCCC TGACGAGCAT CACAAAAATC GACGCTCAAG 
AAGGTATCCG AGGCGGGGGG ACTGCTCGTA GTGTTTTTAG CTGCGAGTTC 



501 TCAGAGGTGG CGAAACCCGA CAGGACTATA AAGATACCAG GCGTTTCCCC
AGTCTCCACC GCTTTGGGCT GTCCTGATAT TTCTATGGTC CGCAAAGGGG



551 CTGGAAGCTC CCTCGTGCGC TCTCCTGTTC CGACCCTGCC GCTTACCGGA
GACCTTCGAG GGAGCACGCG AGAGGACAAG GCTGGGACGG CGAATGGCCT



601 TACCTGTCCG CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCATAGCTC
ATGGACAGGC GGAAAGAGGG AAGCCCTTCG CACCGCGAAA GAGTATCGAG



651 ACGCTGTAGG TATCTCAGTT CGGTGTAGGT CGTTCGCTCC AAGCTGGGCT
TGCGACATCC ATAGAGTCAA GCCACATCCA GCAAGCGAGG TTCGACCCGA



ApaLI



701 GTGTGCACGA ACCCCCCGTT CAGCCCGACC GCTGCGCCTT ATCCGGTAAC
CACACGTGCT TGGGGGGCAA GTCGGGCTGG CGACGCGGAA TAGGCCATTG



751 TATCGTCTTG AGTCCAACCC GGTAAGACAC GACTTATCGC CACTGGCAGC
ATAGCAGAAC TCAGGTTGGG CCATTCTGTG CTGAATAGCG GTGACCGTCG



801 AGCCACTGGT AACAGGATTA GCAGAGCGAG GTATGTAGGC GGTGCTACAG
TCGGTGACCA TTGTCCTAAT CGTCTCGCTC CATACATCCG CCACGATGTC



851 AGTTCTTGAA GTGGTGGCCT AACTACGGCT ACACTAGAAG GACAGTATTT
TCAAGAACTT CACCACCGGA TTGATGCCGA TGTGATCTTC CTGTCATAAA



901 GGTATCTGCG CTCTGCTGAA GCCAGTTACC TTCGGAAAAA GAGTTGGTAG
CCATAGACGC GAGACGACTT CGGTCAATGG AAGCCTTTTT CTCAACCATC
951 CTCTTGATCC GGCAAACAAA CCACCGCTGG TAGCGGTGGT TTTTTTGTTT 
GAGAACTAGG CCGTTTGTTT GGTGGCGACC ATCGCCACCA AAAAAACAAA 



1001 GCAAGCAGCA GATTACGCGC AGAAAAAAAG GATCTCAAGA AGATCCTTTG 
CGTTCGTCGT CTAATGCGCG TCTTTTTTTC CTAGAGTTCT TCTAGGAAAC 



1051 ATCTTTTCTA CGGGGTCTGA CGCTCAGTGG AACGAAAACT CACGTTAAGG 
TAGAAAAGAT GCCCCAGACT GCGAGTCACC TTGCTTTTGA GTGCAATTCC 



1101 GATTTTGGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 
CTAAAACCAG TACTCTAATA GTTTTTCCTA GAAGTGGATC TAGGAAAATT 



1151 ATTAAAAATG AAGTTTTAAA TCAATCTAAA GTATATATGA GTAAACTTGG 
TAATTTTTAC TTCAAAATTT AGTTAGATTT CATATATACT CATTTGAACC 



1201 TCTGACAGTT ACCAATGCTT AATCAGTGAG GCACCTATCT CAGCGATCTG 
AGACTGTCAA TGGTTACGAA TTAGTCACTC CGTGGATAGA GTCGCTAGAC 



1251 TCTATTTCGT TCATCCATAG TTGCCTGACT CCCCGTCGTG TAGATAACTA 
AGATAAAGCA AGTAGGTATC AACGGACTGA GGGGCAGCAC ATCTATTCAT 



1301 CGATACGGGA GGGCTTACCA TCTGGCCCCA GTGCTGCAAT GATACCGCGA 
GCTATGCCCT CCCGAATGGT AGACCGGGGT CACGACGTTA CTATGGCGCT 



1351 GACCCACGCT CACCGGCTCC AGATTTATCA GCAATAAACC AGCCAGCCGG 
CTGGGTGCGA GTGGCCGAGG TCTAAATAGT CGTTATTTGG TCGGTCGGCC 



1401 AAGGGCCGAG CGCAGAAGTG GTCCTGCAAC TTTATCCGCC TCCATCCAGT 
TTCCCGGCTC GCGTCTTCAC CAGGACGTTG AAATAGGCGG AGGTAGGTCA 



1451 CTATTAATTG TTGCCGGGAA GCTAGAGTAA GTAGTTCGCC AGTTAATAGT 
GATAATTAAC AACGGCCCTT CGATCTCATT CATCAAGCGG TCAATTATCA 



1501 TTGCGCAACG TTGTTGCCAT TGCTACAGGC ATCGTGGTGT CACGCTCGTC 
AACGCGTTGC AACAACGGTA ACGATGTCCG TAGCACCACA GTGCGAGCAG 



1551 GTTTGGTATG GCTTCATTCA GCTCCGGTTC CCAACGATCA AGGCGAGTTA 
CAAACCATAC CGAAGTAAGT CGAGGCCAAG GGTTGCTAGT TCCGCTCAAT 



1601 CATGATCCCC CATGTTGTGC AAAAAAGCGG TTAGCTCCTT CGGTCCTCCG 
GTACTAGGGG GTACAACACG TTTTTTCGCC AATCGAGGAA GCCAGGAGGC 



1651 ATCGTTGTCA GAAGTAAGTT GGCCGCAGTG TTATCACTCA TGGTTATGGC 
TAGCAACAGT CTTCATTCAA CCGGCGTCAC AATAGTGAGT ACCAATACCG 



1701 AGCACTGCAT AATTCTCTTA CTGTCATGCC ATCCGTAAGA TGCTTTTCTG 
TCGTGACGTA TTAAGAGAAT GACAGTACGG TAGGCATTCT ACGAAAAGAC 



1751 TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGA 
ACTGACCACT CATGAGTTGG TTCAGTAAGA CTCTTATCAC ATACGCCGCT 



1801 CCGAGTTGCT CTTGCCCGGC GTCAATACGG GATAATACCG CGCCACATAG 
GGCTCAACGA GAACGGGCCG CAGTTATGCC CTATTATGGC GCGGTGTATC 



1851 CAGAACTTTA AAAGTGCTCA TCATTGGAAA ACGTTCTTCG GGGCGAAAAC 
GTCTTGAAAT TTTCACGAGT AGTAACCTTT TGCAAGAAGC CCCGCTTTTG 
1901 TCTCAAGGAT CTTACCGCTG TTGAGATCCA GTTCGATGTA ACCCACTCGT 
AGAGTTCCTA GAATGGCGAC AACTCTAGGT CAAGCTACAT TGGGTGAGCA 



ApaLI 

1951 GCACCCAACT GATCTTCAGC ATCTTTTACT TTCACCAGCG TTTCTGGGTG 
CGTGGGTTGA CTAGAAGTCG TAGAAAATGA AAGTGGTCGC AAAGACCCAC 



2001 AGCAAAAACA GGAAGGCAAA ATGCCGCAAA AAAGGGAATA AGGGCGACAC 
TCGTTTTTGT CCTTCCGTTT TACGGCGTTT TTTCCCTTAT TCCCGCTGTG 



2051 GGAAATGTTG AATACTCATA CTCTTCCTTT TTCAATATTA TTGAAGCATT 
CCTTTACAAC TTATGAGTAT GAGAAGGAAA AAGTTATAAT AACTTCGTAA 



2101 TATCAGGGTT ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA 
ATAGTCCCAA TAACAGAGTA CTCGCCTATG TATAAACTTA CATAAATCTT 



2151 AAATAAACAA ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTG 
TTTATTTGTT TATCCCCAAG GCGCGTGTAA AGGGGCTTTT CACGGTGGAC 



2201 ACGTCTAAGA AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT 
TGCAGATTCT TTGGTAATAA TAGTACTGTA ATTGGATATT TTTATCCGCA 



2251 ATCACGAGGC CCTTTCGTCT CGCGCGTTTC GGTGATGACG GTGAAAACCT 
TAGTGCTCCG GGAAAGCAGA GCGCGCAAAG CCACTACTGC CACTTTTGGA 



2301 CTGACACATG CAGCTCCCGG AGACGGTCAC AGCTTGTCTG TAAGCGGATG 
GACTGTGTAC GTCGAGGGCC TCTGCCAGTG TCGAACAGAC ATTCGCCTAC 



2351 CCGGGAGCAG ACAAGCCCGT CAGGGCGCGT CAGCGGGTGT TGGCGGGTGT 
GGCCCTCGTC TGTTCGGGCA GTCCCGCGCA GTCGCCCACA ACCGCCCACA 



2401 CGGGGCTGGC TTAACTATGC GGCATCAGAG CAGATTGTAC TGAGAGTGCA 
GCCCCGACCG AATTGATACG CCGTAGTCTC GTCTAACATG ACTCTCACGT 



2451 CCATATGCGG TGTGAAATAC CGCACAGATG CGTAAGGAGA AAATACCGCA 
GGTATACGCC ACACTTTATG GCGTGTCTAC GCATTCCTCT TTTATGGCGT 



2501 TCAGGCGAAA TTGTAAACGT TAATATTTTG TTAAAATTCG CGTTAAATAT 
AGTCCGCTTT AACATTTGCA ATTATAAAAC AATTTTAAGC GCAATTTATA 



2551 TTGTTAAATC AGCTCATTTT TTAACCAATA GGCCGAAATC GGCAAAATCC 
AACAATTTAG TCGAGTAAAA AATTGGTTAT CCGGCTTTAG CCGTTTTAGG 



2601 CTTATAAATC AAAAGAATAG ACCGAGATAG GGTTGAGTGT TGTTCCAGTT 
GAATATTTAG TTTTCTTATC TGGCTCTATC CCAACTCACA ACAAGGTCAA 



2651 TGGAACAAGA GTCCACTATT AAAGAACGTG GACTCCAACG TCAAAGGGCG 
ACCTTGTTCT CAGGTGATAA TTTCTTGCAC CTGAGGTTGC AGTTTCCCGC 



2701 AAAAACCGTC TATCAGGGCG ATGGCCCACT ACGTGAACCA TCACCCAAAT 
TTTTTGGCAG ATAGTCCCGC TACCGGGTGA TGCACTTGGT AGTGGGTTTA 



2751 CAAGTTTTTT GCGGTCGAGG TGCCGTAAAG CTCTAAATCG GAACCCTAAA 
GTTCAAAAAA CGCCAGCTCC ACGGCATTTC GAGATTTAGC CTTGGGATTT 
2801 GGGAGCCCCC GATTTAGAGC TTGACGGGGA AAGCCGGCGA ACGTGGCGAG 
CCCTCGGGGG CTAAATCTCG AACTGCCCCT TTCGGCCGCT TGCACCGCTC 



2851 AAAGGAAGGG AAGAAAGCGA AAGGAGCGGG CGCTAGGGCG CTGGCAAGTG 
TTTCCTTCCC TTCTTTCGCT TTCCTCGCCC GCGATCCCGC GACCGTTCAC 



2901 TAGCGGTCAC GCTGCGCGTA ACCACCACAC CCGCCGCGCT TAATGCGCCG 
ATCGCCAGTG CGACGCGCAT TGGTGGTGTG GGCGGCGCGA ATTACGCGGC 



2951 CTACAGGGCG CGTCCATTCG CCATTCAGGC TGCGCAACTG TTGGGAAGGG 
GATGTCCCGC GCAGGTAAGC GGTAAGTCCG ACGCGTTGAC AACCCTTCCC 



3001 CGATCGGTGC GGGCCTCTTC GCTATTACGC CAGCTGGCGA AAGGGGGATG 
GCTAGCCACG CCCGGAGAAG CGATAATGCG GTCGACCGCT TTCCCCCTAC 



3051 TGCTGCAAGG CGATTAAGTT GGGTAACGCC AGGGTTTTCC CAGTCACGAC 
ACGACGTTCC GCTAATTCAA CCCATTGCGG TCCCAAAAGG GTCAGTGCTG 



3101 GTTGTAAAAC GACGGCCAGT GAATTGTAAT ACGACTCACT ATAGGGCGAA 
CAACATTTTG CTGCCGGTCA CTTAACATTA TGCTGAGTGA TATCCCGCTT 



3151 TTGGTTTTCC AATGATGAGC ACTTTTAAAG TTCTGCTATG TGGCGCGGTA 
AACCAAAAGG TTACTACTCG TGAAAATTTC AAGACGATAC ACCGCGCCAT 



3201 TTATCCCGTG TTGACGCCGG GCAAGAGCAA CTCGGTCGCC GCATACACTA 
AATAGGGCAC AACTGCGGCC CGTTCTCGTT GAGCCAGCGG CGTATGTGAT 



3251 TTCTCAGAAT GACTTGGTTG AGTACTAATA GGAATTGATT TGGATGGTAT 
AAGAGTCTTA CTGAACCAAC TCATGATTAT CCTTAACTAA ACCTACCATA 



3301 AAACGGAAAC AAAAAAAAGA GCTGGTACTA CTTTCTTTAA AATTATTTTA 
TTTGCCTTTG TTTTTTTTCT CGACCATGAT GAAAGAAATT TTAATAAAAT 



3351 TTATTTGATT TTATTTAATA GTATATATTA TATTTTGAAC GTAGATTATT 
AATAAACTAA AATAAATTAT CATATATAAT ATAAAACTTG CATCTAATAA 



3401 TTGTTGAAAG TTGCTGTAGT GCCATTGATT CGTAACACTA ATTCTGTATT 
AACAACTTTC AACGACATCA CGGTAACTAA GCATTGTGAT TAAGACATAA 



3451 AGTCATTCCT CTTGTTTGAT AGTATCCAAA AAAACGGCTA TTTTTTTGCA 
TCAGTAAGGA GAACAAACTA TCATAGGTTT TTTTGCCGAT AAAAAAACGT 



3501 ATCTTATTTC CTGCATATTA TACAGATAAC ATAATGAAAG AAAAAATCTT 
TAGAATAAAG GACGTATAAT ATGTCTATTG TATTACTTTC TTTTTTAGAA 



3551 TTTTTTTGTT CTTCAATGAT GATTTCAACC ATTCTTTTAA ACATTGATCA 
AAAAAAACAA GAAGTTACTA CTAAAGTTGG TAAGAAAATT TGTAACTAGT 



3601 ATTCCTGAGC AACAACCCCA TACACACTGG TTTATATACC GCCCCTTTTA 
TAAGGACTCG TTGTTGGGGT ATGTGTGACC AAATATATGG CGGGGAAAAT 



3651 CAGTTGAAGA AAGAAATAGA AATAGAAATA GCAAACAAAA GATATGACAG 
GTCAACTTCT TTCTTTATCT TTATCTTTAT CGTTTGTTTT CTATACTGTC 



3701 TCAACACTAA GACCTATAGT GAGAGAGCAG AAACTCATGC CTCACCAGTA 
AGTTGTGATT CTGGATATCA CTCTCTCGTC TTTGAGTACG GAGTGGTCAT 



3751 GCACAGCGAT TATTTCGATT AATGGAACTG AAGAAAACCA ATTTATGTGC 
CGTGTCGCTA ATAAAGCTAA TTACCTTGAC TTCTTTTGGT TAAATACACG 
EcoRI 



3801 ATCAATTGAC GTTGATACCA CTAAGGAATT CCTTGAATTA ATTGATAAAT 
TAGTTAACTG CAACTATGGT GATTCCTTAA GGAACTTAAT TAACTATTTA 



3851 TAGGTCCTTA TGTATGCTTA ATCAAGACTC ATATTGATAT AATCAATGAT 
ATCCAGGAAT ACATACGAAT TAGTTCTGAG TATAACTATA TTAGTTACTA 



3901 TTTTCCTATG AATCCACTAT TGAACCATTA TTAGAACTTT CACGTAAACA 
AAAAGGATAC TTAGGTGATA ACTTGGTAAT AATCTTGAAA GTGCATTTGT 



3951 TCAATTTATG ATTTTTGAAG ATAGAAAATT TGCTGATATT GGTAATACCG 
AGTTAAATAC TAAAAACTTC TATCTTTTAA ACGACTATAA CCATTATGGC 



4001 TAAAGAAACA ATATATTGGT GGAGTTTATA AAATTAGTAG TTGGGCAGAT 
ATTTCTTTGT TATATAACCA CCTCAAATAT TTTAATCATC AACCCGTCTA 



4051 ATTACCAATG CTCATGGTGT CACTGGGAAT GGAGTGGTTG AAGGATTAAA 
TAATGGTTAC GAGTACCACA GTGACCCTTA CCTCACCAAC TTCCTAATTT 



4101 ACAGGGAGCT AAAGAAACCA CCACCAACCA AGAGCCAAGA GGGTTATTGA 
TGTCCCTCGA TTTCTTTGGT GGTGGTTGGT TCTCGGTTCT CCCAATAACT 



4151 TGTTAGCTGA ATTATCATCA GTGGGATCAT TAGCATATGG AGAATATTCT 
ACAATCGACT TAATAGTAGT CACCCTAGTA ATCGTATACC TCTTATAAGA 



4201 CAAAAAACTG TTGAAATTGC TAAATCCGAT AAGGAATTTG TTATTGGATT 
GTTTTTTGAC AACTTTAACG ATTTAGGCTA TTCCTTAAAC AATAACCTAA 



4251 TATTGCCCAA CGTGATATGG GTGGCCAAGA AGAAGGATTT GATTGGCTTA 
ATAACGGGTT GCACTATACC CACCGGTTCT TCTTCCTAAA CTAACCGAAT 



4301 TTATGACACC TGGAGTTGGA TTAGATGATA AAGGTGATGG ATTAGGACAA 
AATACTGTGG ACCTCAACCT AATCTACTAT TTCCACTACC TAATCCTGTT 



4351 CAATATAGAA CTGTTGATGA AGTTGTTAGC ACTGGAACTG ATATTATCAT 
GTTATATCTT GACAACTACT TCAACAATCG TGACCTTGAC TATAATAGTA 



4401 TGTTGGTAGA GGATTGTTTG GTAAAGGAAG AGATCCAGAT ATTGAAGGTA 
ACAACCATCT CCTAACAAAC CATTTCCTTC TCTAGGTCTA TAACTTCCAT 



4451 AAAGGTATAG AAATGCTGGT TGGAATGCTT ATTTGAAAAA GACTGGCCAA 
TTTCCATATC TTTACGACCA ACCTTACGAA TAAACTTTTT CTGACCGGTT 



4501 TTATAAATGT GAAGGGGGAG ATTTTCACTT TATTAGATTT GTATATATGT 
AATATTTACA CTTCCCCCTC TAAAAGTGAA ATAATCTAAA CATATATACA 



4551 AGAATAAATA AATAAATAAG TTAAATAAAT AATTAAATAA GGGTGGTAAT 
TCTTATTTAT TTATTTATTC AATTTATTTA TTAATTTATT CCCACCATTA 



4601 TATTACTATT TACAATCAAA GGTGGTCCTT CTAGCTGTAA TCCGGGCAGC 
ATAATGATAA ATGTTAGTTT CCACCAGGAA GATCGACATT AGGCCCGTCG 



4651 GCAACGGAAC ATTCATCAGT GTAAAAATGG AATCAATAAA GCCCTGCGCA 
CGTTGCCTTG TAAGTAGTCA CATTTTTACC TTAGTTATTT CGGGACGCGT 



4701 GCGCGCAGGG TCAGCCTGAA TACGCGTTTA ATGACCAGCA CAGTCGTGAT 
CGCGCGTCCC AGTCGGACTT ATGCGCAAAT TACTGGTCGT GTCAGCACTA 



06/07/93 16:24:29 pGALlPNiST-1 



4751 GGCAAGGTCA GAATAGCCCA AGTCGGCCGA GGGGCCTGTA CAGTGAGGGA 
CCGTTCCAGT CTTATCGGGT TCAGCCGGCT CCCCGGACAT GTCACTCCCT 



4801 AGATCTGATA TTGACGAAGA GGAACCAATG TAACGTTACA CTGAAGAAAA 
TCTAGACTAT AACTGCTTCT CCTTGGTTAC ATTGCAATGT GACTTCTTTT 



4851 CACACAATAA ACGGGAAGAA ACGGTGTAAA AGTGTGAAAA TAATTTTTCA 
GTGTGTTATT TGCCCTTCTT TGCCACATTT TCACACTTTT ATTAAAAACT 



4901 ATATCATTTC CCTTGGTTTA ATTCCAAACG AAACGTGTTT TTTTTAGAGA 
TATAGTAAAG GGAACCAAAT TAAGGTTTGC TTTGCACAAA AAAAATCTCT 



EcoRI ApaLI 



4951 ATGGGAATTC TTATTGGATG TCTAGATTGT TTGTTTACTC CAGACTGTGC 
TACCCTTAAG AATAACCTAC AGATCTAACA AACAAATGAG GTCTGACACG 



ApaLI 

5001 ACAAAAACGT TTGGATGGAT GATCAGAAGA TATTTTTAGG CTTAGCTCTA 
TGTTTTTGCA AACCTACCTA CTAGTCTTCT ATAAAAATCC GAATCGAGAT 



5051 AATATAAGAA ATGATGCTTG AAAAACCAGA CAGAAATTGA GTTTCAAAAA 
TTATATTCTT TACTACGAAC TTTTTGGTCT GTCTTTAACT CAAAGTTTTT 



5101 TTGGTAATGT GAGGTATTAG TCAACTAACC AAATAACAAT GCAAACCGGT 
AACCATTACA CTCCATAATC AGTTGATTGG TTTATTGTTA CGTTTGGCCA 



5151 TGATACATTT CATTTTGAAA ATAATGAAAC TGGAATTGGA TGACCAGCAC 
ACTATGTAAA GTAAAACTTT TATTACTTTG ACCTTAACCT ACTGGTCGTG 



5201 ACAAACACAT AAAGTAATTA TGGGAATTAG AAGCGAACAT AGAGGAGTAC 
TGTTTGTGTA TTTCATTAAT ACCCTTAATC TTCGCTTGTA TCTCCTCATG 



5251 TTGGCCACGA ACAGAATACA AGTGGGAACA CTATTTTCTC CATTGTTTTA 
AACCGGTGCT TGTCTTATGT TCACCCTTGT GATAAAAGAG GTAACAAAAT 



5301 GTTCTGTTTT TTTGTCAGCC TAGTTTTGTG CTATGTGTAA AAAATATTGC 
CAAGACAAAA AAACAGTCGG ATCAAAACAC GATACACATT TTTTATAACG 



HindIII 



5351 CAAGAAAAAA AGCTTGTTTT GTGGCCAGTG TCCGAAAAAA ATTTTGGGGA 
GTTCTTTTTT TCGAACAAAA CACCGGTCAC AGGCTTTTTT TAAAACCCCT 



5401 ATCTTCGGAT TAATTTATGT TTTCATTCCA TCGGGGAAAG TGGGGGGGAA 
TAGAAGCCTA ATTAAATACA AAAGTAAGGT AGCCCCTTTC ACCCCCCCTT 



5451 AAAATTTTAA GCAGTTCACA AAACCTTCCA AAAAATATAT GGACAAAGAT 
TTTTAAAATT CGTCAAGTGT TTTGGAAGGT TTTTTATATA CCTGTTTCTA 



5501 GATTGTATTT TCCCGACACC AAAATCATAA TTAATTATGA GAAAGTTAAA 
CTAACATAAA AGGGCTGTGG TTTTAGTATT AATTAATACT CTTTCAATTT 



5551 TGTAACGTTA CAATTTATGT TTATTTGAAG GTGAAAAGCG ATTTATGATT 
ACATTGCAAT GTTAAATACA AATAAACTTC CACTTTTCGC TAAATACTAA 



5601 TTTCCGAAAT GAAAATTTTT TTTAGGTTTA TTTTTTTTGT CGGGCAAAGA 
AAAGGCTTTA CTTTTAAAAA AAATCCAAAT AAAAAAAACA GCCCGTTTCT 
EcoRI 



5651 AAAACTGAAC AAGGATTATT AAAATTTTTG GTGTTTGTTT GTGTCTGGAG 
TTTTGACTTG TTCCTAATAA TTTTAAAAAC CACAAACAAA CACAGACCTC 



EcoRI 



5701 AATTCATTCC TCTCTCATCT TCACACAATG TTTAGACATC TGACACGATT 
TTAAGTAAGG AGAGAGTAGA AGTGTGTTAC AAATCTGTAG ACTGTGCTAA 



5751 CATGATAGTT CGGTTTCCGG GGTTGGTGTT TAGTTTTCGT TTTTCTTTTT 
GTACTATCAA GCCAAAGGCC CCAACCACAA ATCAAAAGCA AAAAGAAAAA 



5801 TTTTGGAAAG AATGTTTTAG CTCATTGGTT TTCTTTCTTC ATTCAATAGT 
AAAACCTTTC TTACAAAATC GAGTAACCAA AAGAAAGAAG TAAGTTATCA 



5851 TTTGAAAGAA TTTGCCCACT TGTTATTACA ATCATATAAA ATTAAACTTT 
AAACTTTCTT AAACGGGTGA ACAATAATGT TAGTATATTT TAATTTGAAA 



5901 GATATAAAAT AGAGTTTGAA AGTTTCCCAG ATCCTTTTTG ATTTCTTTGT 
CTATATTTTA TCTCAAACTT TCAAAGGGTC TAGGAAAAAC TAAAGAAACA 



5951 AAATTTTTTT TTCTCCCACA TATACACACA TACAAACCGA TTTTTATAAG 
TTTAAAAAAA AAGAGGGTGT ATATGTGTGT ATGTTTGGCT AAAAATATTC 



PstI AvaI BamHI 



6001 AAAGAGTTAT ACCCTGCAGC TCGACCTCGA GGGATCCGGG CCCTCTAGAT 
TTTCTCAATA TGGGACGTCG AGCTGGAGCT CCCTAGGCCC GGGAGATCTA 



AvaI 



6051 GCGGCCGCTA GGCCTCGAGG GACTTTTGCA CCAAAAATAA TTTATTTTCC 
CGCCGGCGAT CCGGAGCTCC CTGAAAACGT GGTTTTTATT AAATAAAAGG 



6101 AAAATAAAAT TTAAATAAAT AAAAATAACT CATAATTTAA TAAAAATTTC 
TTTTATTTTA AATTTATTTA TTTTTATTGA GTATTAAATT ATTTTTAAAG 



6151 AAAATCTTCT AGTGTCCTTT CATATGCAGT ACATTAGCCA TCAGTCACTT 
TTTTAGAAGA TCACAGGAAA GTATACGTCA TGTAATCGGT AGTCAGTGAA 



6201 AAACAGCATC TGCTGGTTGA AGAATGCTTG AAGCAATTGT CCAGTCCCAG 
TTTGTCGTAG ACGACCAACT TCTTACGAAC TTCGTTAACA GGTCAGGGTC 



6251 AGGCACAGGC TAGGAGATCT TCAGTTTCGG AGGTAACCTG TAAGTCTGTT 
TCCGTGTCCG ATCCTCTAGA AGTCAAAGCC TCCATTGGAC ATTCAGACAA 



6301 AATGAAGTAA AAGTTCCTTA GGATTTCCAC TCTGACTATG GTCCAGGCAC 
TTACTTCATT TTCAAGGAAT CCTAAAGGTG AGACTGATAC CAGGTCCGTG 



6351 AGTGACTGTA CTCCTTGGCC TTCAGGTAAT GCAGAATCCT CCCATAATAT 
TCACTGACAT GAGGAACCGG AAGTCCATTA CGTCTTAGGA GGGTATTATA 



6401 CTTTTCAGGT GCAGACTGCT CATGAGTTTT CCCCTGGTGA AATCTTCTTT 
GAAAAGTCCA CGTCTGACGA GTACTCAAAA GGGGACCACT TTAGAAGAAA 



6451 CTCCAGTTTT TCTTCCAGGA CTGTCTTCAG ATGGTTTATC TGATGATAGA 
GAGGTCAAAA AGAAGGTCCT GACAGAAGTC TACCAAATAG ACTACTATCT 



6501 CATTAGCCAG GAGGTTCTCA ACAATAGTCT CATTCCAGCC AGTGCTAGAT 
GTAATCGGTC CTCCAAGAGT TGTTATCAGA GTAAGGTCGG TCACGATCTA 
6551 GAATCTTGTC TGAAAATAGC AAAGATGTTC TGGAGCATCT CATAGATGGT 
CTTAGAACAG ACTTTTATCG TTTCTACAAG ACCTCGTAGA GTATCTACCA 



PstI 



6601 CAATGCGGCG TCCTCCTTCT GGAACTGCTG CAGCTGCTTA ATCTCCTCAG 
GTTACGCCGC AGGAGGAAGA CCTTGACGAC GTCGACGAAT TAGAGGAGTC 



6651 GGATGTCAAA GTTCATCCTG TCCTTGAGGC AGTATTCAAG CCTCCCATTC 
CCTACAGTTT CAAGTAGGAC AGGAACTCCG TCATAAGTTC GGAGGGTAAG 



6701 AATTGCCACA GGAGCTTCTG ACACTGAAAA TTGCTGCTTC TTTGTAGGAA 
TTAACGGTGT CCTCGAAGAC TGTGACTTTT AACGACGAAG AAACATCCTT 



6751 TCCAAGCAAG TTGTAGCTCA TGGAAAGAGC TGTAGTGGAG AAGCACAACA 
AGGTTCGTTC AACATCGAGT ACCTTTCTCG ACATCACCTC TTCGTGTTGT 



AvaI 



6801 GGAGAGCAAT TTGGAGGAGA CACTTGTTGG TCATGTTCCT CGAGGCCTTT 
CCTCTCGTTA AACCTCCTCT GTGAACAACC AGTACAAGGA GCTCCGGAAA 



BamHI 

6851 TTGGCCAGCT GGCGCCTGCT GCGCGACGGC GAGCTGCTCA CCACCCAGGA 
AACCGGTCGA CCGCGGACGA CGCGCTGCCG CTCGACGAGT GGTGGGTCCT 



BamHI 

6901 TCCGTCCCCC TTTTCCTTTG TCGATATCAT GTAATTAGTT ATGTCACGCT 
AGGCAGGGGG AAAAGGAAAC AGCTATAGTA CATTAATCAA TACAGTGCGA 



6951 TACATTCACG CCCTCCCCCC ACATCCGCTC TAACCGAAAA GGAAGGAGTT 
ATGTAAGTGC GGGAGGGGGG TGTAGGCGAG ATTGGCTTTT CCTTCCTCAA 



7001 AGACAACCTG AAGTCTAGGT CCCTATTTAT TTTTTTATAG TTATGTTAGT 
TCTGTTGGAC TTCAGATCCA GGGATAAATA AAAAAATATC AATACAATCA 



7051 ATTAAGAACG TTATTTATAT TTCAAATTTT TCTTTTTTTT CTGTACAGAC 
TAATTCTTGC AATAAATATA AAGTTTAAAA AGAAAAAAAA GACATGTCTG 



7101 GCGTGTACGC ATGTAACATT ATACTGAAAA CCTTGCTTGA GAAGGTTTTG 
CGCACATGCG TACATTGTAA TATGACTTTT GGAACGAACT CTTCCAAAAC 



HindIII 

7151 GGACGCTCGA AGGCTTTAAT TTGCA 
CCTGCGAGCT TCCGAAATTA AACGT 



XX fs 

1 MSITV'TFPKS PSTKXRAPAF 01S.LZF3QQQ S3DGAI3KAA LAVPVFSVDN 
XXX R 
51 QDFVLIRDUV KYOTYPSSYQ Lr^/KLVKCM ISKSQIL5CTD XDLNKELFEL 
iOl DLISEADTKI CIFVISLPLV YSRISNKIO/P YVLRE?EQPK VSKA?TCE5CP 
151 ASWAA25DD SMLDDDESDE ^/D2DMDE0ND ^SGELSKGYK HMHKDKPKYI 
201 NDDRVTIGQV FHQYGLDPST NE^ISKINY^K NFGVSOYRFL 

251 PNSKLSVA2R ELVINANNYM DMHi:?EKTES KPKKSFRKPI GKSKXHNLQI 

fs T 

301 DFNSIDLSES ^;:?GC<2FIPD FSIKHLCK\^ NYY^/TSNHQS LELSFOTKNL 

X 

351 NATSNSSYLF ♦CCTV-CIKSKS IQKLVTNSDT ::NYHK?tC/FV TKTYRGPGSG 
401 t;r/CDGA:^K IMKIELSSKK KPFHKRftVSN Mv'RYNKSLKO L^/KEXFSKN? 
451 VSYLLSiQRK Y7I::YSNLEI LHNSLOFr/u UCYHGVAQB TWNNYYKFKIi 

X fs 
501 IDFEQLKALQ MHAMZIEERK ESEKjRQESU, RLVPEDEPUE 

XX L X * X XX 
551 FEQLQS3FGQ P.y.KDLEEKLK RR^^XiEASLSD SFEADSENDD ESELAQIQQD 

missing sequence 

601 rSSSANAtKT KFEAyJlICLI NPAPPPQPrE 7PQLDLNNKF SLPT^YPElt 

missing sequence 
651 PJCLPLELRGv' VPSSX22LPP IKKAIHYVT? YPERFK2CEYL TRKRDYPIAN 

missing 

==saBa 

701 a:jsgwxc- ^6 



117e af 



1 K5CEDEHKNH J/KCKr^-lCHl'^H VAPXP-lTAGQ iStN-NKlCTSK mLMM.VVSA 
ambiguities 

51 DDLAKVTf^S t:<K7?:KTK O 



1Sc1 



fs s 

1 QQSW?gSQP rrSQQTQD.^G MPSGGGGGHG HYQQCQGm yGP??PQOGV 

ambiguities 
X w w wx 

51 VQQgPs5GW9 VYC'XCQQQ? M^r/QQ0?P5G G^niSCLXCSCL AALCVCCTLD 
aiob 
101 MLF 



XX X R X 

1 MFD^•?I1KNL VD2Ti:3EIDS GECELSDDVY yyvsyECTvjj^ EiDDSSErTAQ 
51 ILLSNSELG? XTPNFi^PFS fSINIEimvi SVIOTFKTKKP TTT^/JGTSTS 
X XX 
101 ALSTFESTIF EI?K7FVGSR RXQLSSPKNK NSTIKFDVFD V/IFSSGTTN2 
151 X^.HGLV1V5S GVLLGTCILF TL 



222aa 



£ £ E 

: MRRRErERRK iCr'JCREQRgiC EH£AI:RDIRI QQLSEQDSH5 NQTKXSEXVF 

5: KKAKSTNSGA DETGLMSDKE FDDSAYSPDY LP2EMLWNKP NHPITTNHKTK 

101 KYTENV/ERL I?;??NDTSAY M3SFHDETNI QNEIQIP2ND EYVPQMKATS 

K D VR fs C 

151 SVNNTT:PA5 K5.;-:ESLST$S NKJ\RXFETAD vgvxolds?x xaqtrniwki 
p 

201 QVSDNPW^rA• F7MXXKP.LE? PE3ICLCRDQ p/J 7^ 



32ac1 



r 

ambiguities 

1 MPRIXCVDVF T>rriCYLGNPV A\^YDSOMLT TQSMQXIAHW TNLSETTFIL 
51 TPKSSIADYS :?.:F7S33N'E IiPFAGHPTW tafa:.l3:x3K IXPNKIQQII 
ambiguities 

ICl QEC3AG1VKI rv'SKT?N?JNN NKr-miCSNEL PFL-LSFELPV rKFHEIDDKV 

fS ON 
151 lEELQKSWNO T>::ZG:<PV'LI DAOPKWAVFQ LGSCSKZVLOL NVDLAQIEKL 



ambiguities 0 G 
201 SwEtlCWTGIG VTGftKVEMGD S'/ELRi!IAPA VGV 



^aK parti 



XX XXXX fs 
1 Nv^TDSTPFAM GriaSTFYAV TSVGRSjCIY DLATLHLIP/ SQIV^PSRIT 
C XF X 

51 SLAAKHHYVY ASVGDRIGIF RRGRlEHSIiV CEGNSTVT-JQL LVFGSVLIAT 

X X XX X xxxx 

ICl TLEGDir/rS KTEGKXFPrs L^TTiaill-JS U'EGSIVGLI KPPTYLWICV: 

xx:-: X 

151 VkTI^QSVFVI in^TGKLLYX SHELQFSGEK ISSIEAA»VL DVIAVGTStJG 
fs 
X r 

501 NVFLF7;iKiCG iSWAKKLLL LEU^LrSKVA SISF?.TDGAP HLVAGWOfGD 

AIR 
a =: a 

251 LYFYDIOXXS RVhVLRWAKK ETKGGVA*VAK rLNGQPIVLS NGGDNHLK3P 

H S Y G R c 

301 VFD?NL':rrSN 55IVPPPRHL RSBGGHSAPF V-AIBrPQEDK THFLLSASKD 
£s 

V LC ambiguities 

351 KTFWrFSLRK DAQAQSMSQR LQKSKDGKSQ AGQWSMR2K FPEIISISSS 



aires 

401 YAK 



3?qK pirt2 



1 IITAHKDE77 ARTVrDSRNKR VGRHLLNTZP vSGIVKSVCVS ^CGNFG-VGS 

51 SIGGXGSYNL QSCtl,?SKrJ LHKQAVTGLA IDO^lKmVS CGLDGIVGrv 

101 DFGKSWLGK IQLHAFITSM lYHKLSDLVA CALDDLSr^*/ IDVrrQXVIR 

151 ILVOHTORIS GJIDFSPDGR'W IVS^/ALDSTL RTVt^:*?TCGC IDGVILPIVA 

:0i TA\TC7SPI0D ILACTHVSGN GVSLWTNF-AQ FKF'vSTRHV'E E02FSTILL? 

251 KASGDOGST:! LnGFLOEDSX EDGTIC-EQYT SAACIDASLI TL3SEPRSKF 

301 ^'TLLKLDTIK QQSKPKEAPK KPENA?FF1Q LTG 



7 



MqK Bart? 



1 KLRKLDTKGN XP^GiSGQF S7JI-TYLLNL SPAVLDLEIR 

51 SliNSFVPLTE MTNFIQALNA GIKSNAN^'ZI WETLVAMFFN XHGDVIKOFE 
iOi NET3LH2ALE tVR^^-i^EKi: raOSIVTvC ASIVSFIS 



XX XX X 

: JCLQKNRADLD XV-grCIWYC^.T RKLIO-DILKS S53SI3EQIK IRDPISRVYC 
fee X Xfs XG rFXXXE V S DPTX 

51 PF.ILDLVKAI GTDKi:EA:.KK KQLVSQuQiS IDNLLVQEVP RSKRVLGGAV 
-XF3XXX>:i;:Q'X X rxv loissing aeouence 

101 XSTFSTIFt^i KKELIiQHQVQ IHQNCDKELD QLRVLIARQK QXGELINAF/ 



missing sequence 

sBSAus asasssass=: = 5a*aBOMsatsseMa BSCS 
151 SSOTJEMtOK? I42SVD'/T3SK IKQARRRAKK IL 



NSX Mrs XX G X M R XFXV/ X 

== = = ■ - =13 a a = s a 

1 NLTLKrinn' :<i.^:sQi/^rtX shksibtlzk r-flosvNAVND sddl^stcEi 

X R GX X R X XX X 

51 IRR?TLEG^,'X SO"STSKDIT S^CKLI^^IPTrr ILRETRPDEK WEDyLIDVI 
^ * X R 

101 APQXwlQSSD VPDSWLIST PSIXGKILSI KDSRNtTANQl LLETRYGILL 
151 KDAKVF^/LNK E::IVGC?DMI. SISNPYGAKS f7^E>p'A»LGT3I TgWGKWAGAN 

201 M.LrEXi;,SVM r^CYESEXLS SJaSFNAQDXi DQESQSJJYOT DNSK3APLRL 

FX' IE 
251 G;DMPSV^/IT STSSQYFTLY VIIVSLLFVS EPMSJr^IHKK lEKMKPSIDP 

F E YXX X * X 

3 01 BDLGALTSIC TK^QQHHKLL XVLSXNKXFP ERCNXTNEVL CFIYLQGESW 
351 KGGE 



h3 li 



SQaK 



FIG 

S3 ^9 

X r-TDFSDFKTT KIPA-AEuD: LKRCVICKDw LNAPVTTQCD Hn'OSQCIRB 



SI ?1LH:?NRC?L CKTEVrESGL KRDFLLESrv^ ISVAaLSPHL LRLLSISKVE 
ICX SKC2V23EKS ANESALWGNR N'/T^'NDVDBr/ RVXDQLNADK LGEEKGCAfiK 

3 fs X 
151 ;/H0WEQT?3 VILLI SDCEH NGSDSLW.C? ICFSiOJEuBV LQGKHIODCL 

fs 

C C X ambiguities 

201 SGKSTKRTPT CZZS?^<r.P KQ IT SFFKP7 IDTCTPSPPT SKASTTPTAT 

S C N IK M 

251 PTTTuLKAlx^/ AS?S?VAQ$T VHXQXFLPKL D?SSL37QKI KAKLS3LKL? 

* T 
201 TTG3RN2MEA ?.VLKYYVI\^I A:n.DS>:H?V' 



2SJSB 



■m 

0 3 fs 0 A 

a w 3 « = 

1 VQFSSAWtS AVAOSAIAAV SNSTVTDrOT r^/TrTSCSS NKCHPTEVTT 

51 G'^r/rSVX-T WTTYCPLST TEAPAPSTA? SCEBOKCHSTT 

101 A^.TOSVrTVT EGTri'iTTYC PLFSTEAPGP APSTAEESKP AHSSPVPTTA 

151 AESSPAKrtA AESSPAQETT PKTVAAESS3 AETTAPAV3T AEAGAAA^UV 

201 PVAAGL-^LA ALF J 2 



X X 

t R£iSIISE£>: KIC-rPFFTIV ARIPVIEAFG FSEDIRKKTS GAASPgVvTL' 
51 GVDrCiOrOP? :v/PKT2E3:,E ZLGEFAEREW VARRi^flKIR RaKGL7\/DBJC 
X X 

= m 

/^o / 

^SS-wM ~ 

1 VMFXCQGSQS PET^TiEHLINI* IDSJGKIDfS Sr/S^SSHiC DGA^-VLVDW 

51 BG^/CSCTVKV LRQavi^KLK PLLVIKKIDR LITr/ftCtSE'L EAY^HISRII 

IT'l £Q^;NSVrGSP "A3D?.LSDJL KIVHEAGSVGS FIBXSCSDLY FTPEKMNVIF 

1£1 ASAIOGWAFS V^JCFAXZYLK JCuGrSCQALS KTLWGDPYLD MXWXKIIPGK 

2 CI XLW;XSNfl«:< FIF/SLILDQ VWAVYSK-O/I EPaV^jDKLSKr lEXLGAKITP 

XXX 
■ as 

251 KDLRSKDYKN llMLrMSQftrl PLSHAILGSV iSYLFSPrVA QR£P.IOKILD 
XXK XXXX XJCOC XX X ambiguities 

aaa en«s3 83ss= -a a »a3 = = = s== = »se3a«aas3a»==»== = 

301 ETIYSAVDSB l^KSXLVDPS FVKAMQSCDS SHPfiTKTIAY VSKLLSIPJJ3 

ambiguities 

» aaa 

351 DLPKASNAAT GGLTADSI^E RGR:A5.EIiA:< XASEAAALAQ EGSX>r3DEFA 
X 

9 

iOl IKPKIOPFEW S?Z£DDrENE EDESDAIW,^ ESTSTIVGFr RTYSGSLSRG 
X S Y G N7R 

■* » X = mam 

431 QKLT/IGPXY :^?SjP?.:?HQT NFSQITNEVE IKSLFLIMGR SLVPwMEKVPA 

? *R V 

501 GNIV7/VGL3 NA'.XKNATIC 5PLP£DXPyr 5IZ;A£TST::IH NKPIMKIA'/S 
351 PTWPr:<rAKL e.^g:j 



17<^ CD 



1 ??:<^/AK5a3S tigkifrytf ytavisvigs agligyr.iye bsopvdqvkq 

X 

51 TP::.r?r:GSKK xtlvtz^gsgw ga:slikkld ttiy.wivs prLNYFi.rrpt 

is X fa £s 

101 LrSVPTGTV^ L?.SriEPVR6 VTRRCPGQVI YLEASATOIN PKTNELTCXQ 

151 STT'/'/SGHdS Ki:CSSSKSr/ AEYTOVESIT TTLNYCYLW GVGAOTItIF 
XXX XX XX X 

a m u as S! 

201 GNPGRPi-mK? I'PTTE^acsO Sh^ilK 



ISZGL 



1 MAKrlKAGlC/ AI\'V31GF.YAG XmiVKEHO EGTKSKPFPH AIVAGISRAP 
51 LKVrXKMDAK X^.TKRW/K? r^'KL'/NVKHIi MPTRYSLDV3 SFKSAVTSEA 



101 LEE?SQRESA KKVVKKAFEB KHQAGKirA'WF ^ 



V 

1 MSZPSTQVGF FiTCVASGtiV- XKDLPD^KPG AOOII^LXVOA VGLCHSDLKV 

£s 

X is D XXX 

- =3 » s a r 

51 LYEGZiDCGDN rvWGHalAGT -/AELGEEV'SE PA^^r-SVACV GPWGCGLCKH 

missing sequence 

anbiiTJitiec X XXJi:<rFGXX£XX 
101 cltgki»a;ct /:s?LrwFr;LG y>'aGyE!Q?i:, v:<?.PEucLvTcr pdkvtsseaa 

missing sequence 

151 AITDA-.'LTrY HArxSAGVGP A3MILIIGAG GLGGNAIQVA KAcGAKVrVL 

missing sequence 
201 DKKDKARDQA KATGADSVYS EwPDSVLPGS FSACFDrv'SV Q^TYDLCQICi 

missing sequence 
251 CEPKG-nVPV' SMATS:-NIN lhi:LDL3.E17 VKGSrtfQ^Ui OLREAFELAA 

missing sequence 



301 OGK\'KP-WAH A=1SZIP:<YM EICHAGGYEj RWTKP 




409cip 



K V K E D 

ta s ~ a — 

1 .VXIvQAYDJT; VR3DFMATP/ aWSTGD-SSI. 
y P ? LSYD 

51 EDFVEHFTDG r.VQFGLAR'-T VPGSDySKNI LLGKCPOSAP AKLRCAPANN 
3 1 M M V a X 

3 a = —So r 

101 :>r/QITASDQ ODlC'/NErLN aVCAAAGARy STQTSGI/KK? 

151 SPAAPK?rSX P/VAK,?SSAS K?S?VPKSTG KF^TAPAKPK? KKI7KD?»G?."C- 
201 DABDVEERD? OKKPLK^PS AYK?C'm:iD BLPJCCK3DTT SSTPfCTrKSE 

? HD R T 

251 P^rEEKNDUDO CSICPLSSRKK AYDQPSSSDG RLTSLPKPKI GHSVADKYKA 

fs 

301 SASGfNGAAPA FGA?.?ArGTC S\^'SR:<DKLV GGLSRDFGAE NGKTPAQIWA 
331 EKRGXYKTVA JDiKECMSSS KVTEPSEKHA ADiAKKFESK ANrACDTTPSL 

K S fs X 

401 PT.^^LPPAPP A-^2TAX?S>;S ja:<£3K2EEE QAPAPSLPTR NLPPPSQRQ? 

fs X xs ndasinj sequence 

451 EPSPEPEEEE EEEESEAPAP SiPAFlTLPPA PKAEAEESKK Q5TTATAEYD 

501 YHKDEDNEIG F3E-GDLII3I EFVDDDWWQG KKAKTGEV'GL FPATYVSL 



1 VLGSC'WGDEG K.GKLVIJLuCD DIDVCASCgG GKNAGHTIW GrVKYDFHKL 
31 PSGLVNPKCQ ^'LVGS3^/^yIK VPS??ASI>HN LEAKGLDCRD RLFVSSRAHL 
A Z TX 

ICl VTDFHQRTDK LKEAELoTNK K3:G'?rGKCI OPTYSTKASR SGIRVHHLV?J 

isi p:?peawsefk thy:?,lve5h ^-.rygefeyd 



^ 9 coc4a 



f 

I EDKXQHFrAS C-ASA\':)DKTr. TA:i.RKKXKr NALWDDATN DDHSVI'TMSS 
51 NT'MSLLQLF? r-OTCVKOKK RK^T^-Llvlj; DDDMPIX?;>Jl Vh-HCmi;LK 
101 VJUiGDI'/r/K rCPDIKYA:.-?, IS'/LPIATTV EGIMGSLKDI. YLKPyfrVHAY 
151 P.?V?iCG!)L5T '.-^GGMRCVE? ??n;E^yTCSir AIVAQDIIXH CEGEPINRED 
K fs 

2 CI SENSLNE7GY DDI'JGCKX^K AOrRELVEL? LRHPQLmi GIKPPK5ILM 
2S1 YGPPGTC-XT: MAPAVAN'STG .s^-fPLrtCGFE IMSXMAGESE SJvXRKA?EEA 

fs ? f 5 ?f KX P 

301 EKNSPSIIFI j2II?3IA?KR VKTl^QSVERv WSOLLTIMD GMKA3R5K-.Vv' 

G L Pfs 

351 lAAmPKSI DPAlRPJJGRt* DREVDIGv^D AEGRLEILat KTKKMXLADD 
401 VDLEAIASST- .-IGP/GADIAS LCSEAAMCfOI REKMDLIDLE EETIDT2VXN 

4 51 SLGVTQDNra ?a:,g:is:^psa ItREtwe^v?; -.^r/^DDiGOLr niknelxstv 

501 EYPVLKPDQV CKFCUFCKG VLFFG?PGTG K'TLLAKAVA? EWSJC^FZS'/K 
551 GFSLLSMl'JVG ESiSr^ilRi:!? DXARA.\A??V \*TtVElSi$lA. KARGGSgCDA 
6"01 GGASDKWN'iJ LlTSUDGMMA KKNVFVrOAr NP^DCIDPAL LRPGI^LDQir 
651 r/PLFDSPAH LSILQAQLHK TPLHPGLDLi: aIAKITHCFS GADLSYn^gR 

Af3Q I 

7C1 SAKrAIKDSl EAQVKZNKIK EEX3:KVKTEr yi>:.£K'/D5V'3E EDPVPYITRA 
751 HFSSAWXTAK RoVSDAELP^^ Y5:SV;^QgtQA SRGCFSSFRF NENAGATDNG 
X 

m 

801 SAAGANSI3AA ?'S^. 



?^fl9 Bflrtl 



is 

1 TLKCRwESIl FAKAEK^/KQF KKEHOKTVIG 5VLX.E'2AVGG SGGIKGLWE 
51 GcVLPPIEGI RfRGaTI^Or QKEtPKAPGG EEPCPHALrW LLLTGE^T-TP 
101 AQTXALSESF AAP-5ALPXKV SZIillJRSPSK >:?KAQFSIA V7ALESBSQF 
151 AQAYAKGANTC «3VW:<y7YEP S:OLLAKL?T lAAKiywaT HDGKLFAAID 
X T X 7 
201 SKLDYGANIiA SLLGFGDNXE rVELMHLiXT IKSDHSGGtP/ S*\HTTHL\'GS 

RI * 

251 AliSSPFLStA AGLKGLAGPL HGK«JQ£Vu 



hj Si 



1 CRSFALKHM? UVHIFKuVS:: lYBVAPGVLT KHGKTKNPWP JWDSHSGVTiL 
5 

51 OYYCLTECSP i*r/L?GVFRA FGVjuPQLIZ^D iiGIOI-lPIER? XSrSTEKYIE 
101 LVKKIICK 



^0 



aabiguityx X • GL 

1 SDYHVIUXA?: RMT-IGTMEAEV RLYL:*VITLI ISPVSXIKFG ^/GAAREWPWQ 

51 VIYVGLGFIG FCy/ZCSIGSTS MSYU-IDAYPD IVIQGWGVS IXKOTIACIF 

101 TFACSYV;L3C SG?3NTYIAL siidfatiai VFPFLY^GK? frrktkrlw 

151 St^VELTOGM 



409e5 pqrtf 



1 I5QNNEDFIPG TZlilVStKVO SBDEK'/SKYD ASSKPJOTKTK QVIILFPCPS 

51 X:-::^S23Fri VIPITAFTAA TSNDAG.5IQD SliNEKYGISY 

101 DA2^>JTGAGU rLGIGWGTFF I^TPASSLYGR KITYPICZFL GLLGAWFATi 

151 \'KSrSDSIW3 iiryGISESC AEAQVQLSIS EL-iTAHNLGS VL75YmTS 

201 VG7YLC-?LIA A?ZVQKIGr« WVGWIAAIIS CAuLrvrvpC LDETVFDRAK 

251 FTKP 



iSEl 



1 OFCI/CDI-fc'K VSSr-CtVrGGFr SCrrT^DrJC'V enesk^wfhx fkqdlmkilk 
51 DCLTVSDDKS NIE.=IFLQFK: 7IV7CFYSME EYKYEIVDDI, IXFrTINMNS 
X * 

ici HGRivii?ciiV ^;t:n:?cr,riELi KKll0Kv^K^• kqrcdxoqqk qqqqqqqqqq 

ens 

QOQQ ta X airu:igx;i::ies 

151 XSNNSQHrVL :?:IAIIC5NTH SISRMPSIKM LLDLVKS-VW 

201 NKKKUJFVDK S^r^YY:.Zl^?S CDtlPSEfJRF KKLFESNHLW 

F 

m 

X ambiguities 
251 ZHZDYQDSlh CSriKSHlFV YIGHGGCDQV IKVSKLFKXC GTiWQDLLKKL 
501 ??SLLLGCSS ^/XI-CNCNYNY KSSMI-QPIG:! lYKWZJICKSS KirxaOiKD'/T 

D 

351 DKDIDIFTLS LLI^K^JGLIAU VKGSGHDYGM KXLDITNCW C'SRSKCTLXV 
4C1 IO?GSA?mG IPX 



190a3 



A 

i STKt;:KVK.*T FIS.^NLKPDL LKGIVAVGFZ TPSAigSXAX MQIISfSRDTI 

M 

a 

51 AQAQSr^TCJCT ATrSIGMI^ET^ lOrKSKSCQA uILSPTREIA 
ICl GDV1G;IKTHA CIGSfW^ED VKKLQQGCQI VSGTPGHVID VIKRRHLOTR 
151 NIIO/LILDEA DSLFTKGrKr QIYSiyKHLF PSVQ\Wv'SA TLPR2VLE^r^ 
fs T 

— s 

201 SK?TXrr/KI L\->:?J:EISLL <;iKQYrV»3CE REDWKFDTLC StYDITLTITg 
251 AVIFCNTrXK VIC^iCADyMKK Q>T?T'ArAMHC? DtOCQDERDSI MWDFKRGNSR 
301 VI^ISTOWX^ G:r/Q5VSLV IN-YDLPTDKE NYIKaiGRSG F.F'3RKGTAIK 

X 

351 LITiQDV/TL KSLEKVYSTK IKEMFMNIJH: IM 



4QC af 



1 2^A^J>TDIiri.T RFIigSC^Qf'^' ArTATGELSL LLMALQFAFK FIAHIvflRRAE 
51 LVNI-IGVSGS AKSTGDVCrCy LI.'VrGDEIF: NAI-!RSSNHv'X VLVSESQEDL 
101 r/FPGGOrVA vrTTPIDGSS NIIlAGr;SVGT IFG\'VKI.QE3 STGGX3DVLR 

missing sequence 

151 ?G:<SMVAA3y TMYGASAHIA LTTCHGVMLF TLC70LGEFI LTHFfJLKI^PO 

missing sequence 

201 ?KNIYSLN£G VSTCKFPEYVQ DY^iKDIKKBG YSLHYIGLI^^/ ADVHF-TUYG 

missing sequence 
251 GIFAy?T:»XL ?.%XYE:fPMA LLMEQAGGSA VTIKGSKIiD iljPKGIHDKo 



missing sequence 

301 SIVLGEXGEV ZXVCKH^'PK 



0 360C6 



1 DWSSTSTAE A'.T^A'ZIK'v'ZO EFPQES3Affr SLEDKI^/SAV :-3IIlMC?LI 
51 AFGGFVFGFD ?GTISGriNM SDPLSRFGGT KADGTLVFSW VRT^LMIGLr 

101 NAGCAIGAl? ISr/G3MYGR RVGIMTAMIV YIVGriVgrA SQKAkVYQVMI 

ambiguities 
X 

« 

151 GRIITCIiA'A? >::,SVLCFI.Fr SEVSPKKLRG TLVCCPOLMI TLGIFLGYCT 

f5 
a 

201 r/GTKSySDS .^QW?.I?IX3IC FAVJEALCLVAG tC^K}iP2S?^Y LVGKDRIEDA 

?R 

ss 

251 KMSIAXTNKV SP33?ALYR2: LQirQAC^TSF. HIUAGKASWG TLFUGICTKIF 

'V ::!as3inff setjuence 

301 SRVMLGT.-MLQ ALSQFNTrGKN LFPSYL?SXP K fTC 7t 



missing sequence 

1 NArVSGCITS FU'DTOAr/S VGQEirEOlEE GDAPAGGA5A SEAPAKXEEA 

missing sequence 

51 PEXAXEBSA? AA^-PFJCESTK KSEPKX23KP APKKSESXXS TQSrrSAPTF 

101 ra7SPi^£HRV KM::.V':RLR:A EKLKESQNTA ASLTTfMEVD MSNIiMDFRKK 

missing sequence 
151 yXDEFIZKTG IKL3FMGAPS KASAiALKSI PAVN-UIEN^I DTLVFKDYAD 

missing sequence X XX XX NX * 

201 ISIAVATPKG L'/TP'h'^/TCIAS 5iSrLGIEKE ISJiLGKKASC GKtrLEI»lTG 

S X XX X C X X^ X F XF X IX 

251 GTFTISNGGV FGSZYC-TPi: IvXPQTAVLGIi HGVKZHPVn' NGQXVSRPMW <^ ^ 

301 YIALTYDHRV VDG^SAVIFL PiTZXillEDP P.KMLL 



gf 



1 EKMXSGMIV:: CLCK^-Er?R MSCPJYMLl'V GSFRTaCYXT rQEFLDAKYK 
51 HLESFIGHVG iQ}.^KZV?Zr rDYOFt/TYLG DNr^'SNYNLT ILWSiVRlG 

£3 

iCl bI?:\7:CG?N^;i IlTP-XHrVDP Tl-RYD^I-EMA LPVTrz-GNOW l-CGSCTILGG 

X missing sequence 

151 V7r«-G3G3IVA AGA-/'.mDv'P PNTvVAGv'PA RWKQLEPPa) PHF 



S5q1 



XX XXFY X ? X X 

MM aasss — = "7 

1 TSDTKTKQFi: >1Bhi:<DlSii'i GGHLKTVPRS 5SSSSSQKKK SSiOCQRHNCE 

T HL V L 

31 DDSENGOTEG fLDASSS.^KZ LQ'uAKEQQDS LBQEDSIQNX P3FAQSFK.VQ 
: L li FI S 

ICI OIDSEESESS D£V5DrSB2^ HVEEIT/rJES CPJr'/DPKDAE LrNlCrFCSKG 

D FX R X P X F 

151 EA^JSXDDDNS F^KINLADK ILAKIQEKES SQ0Q0QCS3? CN3HEDAVLL 
E : 

2C1 PPKVIIAYKK lO^ilhSTiTH GZLPKlFjCIL PSLKKWQDVL YVTNPNSltfTP 
251 KATYEATKXiF vSM/SSNEAT VFTZTILtPil FPXSIENSDD HSUWilYRA 
301 liKKS:*yKPGA FFKGFLLFL^' DGYCS\'F.3:AT IAASVLTKVS VPVLHSCKYC 
351 G'/LMNKK?.E5 FVFVLRF.I 



f'3 ^ 



# 3g 

: >r/VyKHDGA:< iPVRFDKITA KVORLCYGLN Pl.WEPyAIT QKVIS^/VQG 
3: VTTIELDXIA AIIA-^TMC^X KPOVAVLAAR lA'/SN-i^HKQT TKQ'<r5KV$JC: 
101 LVEyiNPKTG IHSPMISKET HMIllEHSDE LNSAIVYDRD 5^mjyrcFKT 
151 LERSYIXRli; GK'/AERPQKI, IMRVAVC-ZHG KDlPRVIEr/ hJLMSQRFPTH 
2 31 GSPCtFNAO^ PP.PQMSSCPL LAMKDDSIEG I^DTLKSCAL ISKSAGGXOt 
251 HlrailRSTGA YliSTNGTSia GIlPlfTIlVFN N'TARyVDQGS t^RPGAFAl^V 
3G1 LEPWK3DIFD FI0XRKNHG3C SSaRARDLFP ALWIPDLFMK ft«/SONGDWTL 
351 FSPKEAPGLA DXn^GDEFEEL yTXYEKSNRG P.QTIKAQKLW VATLGAC'TET 
401 GTPFMLYTOS CC^TCSKgKNL GIIKSSNLCC EIVEYSAPDE VAVCNLASIA 
45i IjPSFV2ND3K STmTDKLH !2VT:<V^/^?w^^i, imvZDS^HYP VPEASRSKMR 
501 KRPIAIiGVQG lATAFMELRL P?DSQSA?3L KIQZPETIYH AA^/HASISLA 
551 KESGAYST/P GSPA5CGLLQ FDLW^KFTE LWDWDTLKQD lAKHGMRNSL 
601 LVAPKPTAST S^ILGNtJECF SPYTSNIYSR RVtAGEFCIV NPYI^tKOLVD 
651 iCm^DAJIKS SirAlJKGSIQ ALPMIPDEIK AL/KTVWEI5 QKHIIDKAAD 
701 RAAFIDQSQS LNZHIJCDPTM GXtTS^lHFVG WKKGLKTGMY YLRTQAASAA 
751 IQrTID^KIA ErAGHTVA-NX nONIXKrv'N XGRVSSENTiJ DAPYKSFSTE 
8C1 PTSLSSSVAD LK.IXDEGEKP AEDKTIEEbE raiYSAKVIA CAIOTPESCT 



831 MCSC 




fs X ta V 
L APKVjYQSEDV PAPKCCRICTA ^IPCKI-RASLA PGrvtiltLAG HFRGKRVWX. 
31 mEDNTLLV 3GPF?/A^/P :,R?^/NA?.WI ATSTKVMVSG VDVSKnWZY 
lO: FAREKSSKSK XSEAiFFN-fiS QFKKEIKAER VAD^KS-ZDAA LLSEIKKTPL 
151 LKQYLAASrS l/^WrRFKLL KF 




% 32B03 



1 MSNDKGCLVE LYVPRKCSAT h-RIIKAKDKA 3VQISIAKVD BOGRAIAGEK 
5: ITYALSG-A-R GRGSADDSL^J P.LAQCDGLLK NV^'/SVSX / A Q 

fiZoL&arll ~ — 



X 

1 MSPKGFKKG7 LRAPOTMSO:-: rNMGElTQDA WLDAERIU^K EIEMETKKL3 
51 EESKKYFNAV NCrMLDEQIDF AKAVAEiriCP ISGRLSDPSA TVPSnNPQGI 

P 

s 

IGl aASZLYC^W !01KDT1KPD LELIEKKK^ LAQSI-LKIia AIRia« 



/o/ 



P7qt Mrtt 



A 

1 HKWIFSKVE!. KKERWXDEH KMFSAQTBl'E lAQQEyDYVN DLLKNELPVL 
Y y ID 

a « as 

51 FQMvSDFIX? Lr.'Sm'MQL NrFVTLVTFJl SELKIPYFOL STDIVEAYTA 
S fa fs K 

101 KKGNIEEQ?:* AIOITHFKVG KAXSKiSATK RRHAAJ^NSPP PTGASSIAST 
151 CTGGELFAVS PGGyrJQPYGD SXYQPPSSPA TYQSPVVAAT AQSPATYOS? 
2Q1 VATGQPFSVL PQTPASAPPP Q^*GSGLPTCT ALYDYTAQAQ GDLTFPAGAV 
251 IEI:QR72DA SGl-JV-TGICYNG QTGVFPGNYV QL 




1 M'ZTS:<STfLF TSESV>:SGH& DKICDQVSDA II^ACXAV'DP LSKv-ACSXAA 

51 KTorirw/rGs iTTKAQLDrQ k::fj)ti:<ki oyppsekgfd y;c^cNvi-Ai 

101 EOQSPD:aOO LHYEKAISSL SAGDQGIMFG VATDSTrEKL PiTItLAKXL 

151 :CAALASAR}?S 33t?WlRPDT KTQVTISYSK DGGAVIPKRV ::TIVI5TQKA 

201 EEirrmRK EriSniiKgv ipehll:;dkt iykiqps<?rp vigopcc-cag 

251 LTCRKIIvT? YGGtV^JAHGGG AFSGKDFSKv DRSAA^-JURW VAKSLVTAGL 

301 AK.^VC?SV AIGVAEPTSI Yir'TVGTSXL :5THAL7E:IK NNFDLP.PGT/I 

351 iKSi^LAF-Pi Vf K?AsyaHr tkqei-isv;eqp KKLKF 



1 FraSADAKiT JXHIIKGIKLY GKAiKLKKID AXSQSST:^F KXQTIGr-VQ 

51 sD:,iNP!rs:'iD v3a:<lf:n^x :j?lvdssflm D?FSKFc;Tir rnpiiprdsb 

Missing sequence



101 GHStG^'SFLT* Y:iDrES5DLC rQKMCTILM KrAlAlSYAF XDLSVOOKKS 




232c qq 



151 PHGD-fiVERXZ, GX^Cv? 




V' 



ambiguities L XP w L S M



1 CPFLTEISLH LIQ/HQH^UZA TIKy/UIXED FAIIILMDHF £C<;DurTNII 
S X X 
51 DRQ1FTNK5H HKVPF?DFET QLl-l-OCNAMLQ LIEAISYCHE MKIYHCDLK?

101 HNihr.ravriPv r\'?.?Titnmi nngeotlcva rjsiicYNELK lviidfglam 

151 DSATICCN5C KOSSFYMAPE RTTrmFTHRI.- INQlirMTJQY ESIEXNGTr/ 

F V 

s e 

201 TKSNCKYL?? tAGDIWSLGV Ii?IKITCSRr4 PWPIAS?DNN 
L V - L D D 

= a s r SB 

251 >;^7JrJKA2LSX riPlSSQFNP ILDRI7KLN? NGRIGL 



/o 



1 SGTXQCXT5I IK^-MKADr.? KCVEWICPP ALYLSLAVBQ MKOPTVAIGA 

X 

51 QNl'F^KSCGA FTSETCASQI LDVGASV/TLT GHS2RRTIIK ESDEFIAEKT 
X XX GCWFQDXX XGX 

= c — ===saB9aa saw 

101 KFAL3TG'/KV ::,CI3STIEE RKGGV7LDVC ARCil^AVSK: VSWSNIWA 

Missing sequence